<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/3.26.0">
</HEAD>
<BODY>
On Sat, 2010-05-22 at 23:19 +0200, Richard B. Kreckel wrote:
<BLOCKQUOTE TYPE=CITE>
<PRE>
Hi!
jros wrote:
> Although I don't have a solution for your problem, as I'm myself
> addressing similar problems
> matching common subexpressions to variables, in down top manner, I think
> that such a functionality
> is implicitly implemented in GiNaC.
>
> If I understand GiNaC internal structure correctly, subexpressions
> common to two expresions,
> are frequently shared internally, to save memory.
This is entirely correct, but...
> So it must be possible to write a print algorithm that goes trough
> an/some expression/s tree, and that replaces
> every shared subexpression (let say sum product) with a variable, that
> again is assigned a expression that would be printed
> in the same way using recursion.
...first, this sharing is entirely transparent for the user...
</PRE>
</BLOCKQUOTE>
You mean that we can not look to the smart ptr of a expression, or some of <BR>
its subexpressions like add and mul to see if there are referenced by more than one<BR>
expression.<BR>
<BR>
The idea would be: if an element of a subexpression is referenced more than once we can<BR>
call it atom, and printout (for C code) the expression using the atom (avoiding expansion of the subexpression),<BR>
and also print atom definition (for C code).<BR>
<BR>
It would be nice to be able to print expresions in GiNaC to see th level of sharing that it is using.<BR>
<BR>
Is sharing also enforced when solving linear equations?. I think this is a really important place for optimization if done.
<BLOCKQUOTE TYPE=CITE>
<PRE>
> Probably allowing/disallowing some kind automatic simplifications (so
> that subexpression sharing expected value increases) can probably help
> to obtain improved results.
...and second, sharing is currently not pursued aggressively! Rather, it
is exploited whenever it is trivial to do so. (The product rule of
differentiation is an example where exactly the same terms pop up again
and again so exploiting sharing comes at no cost.)
</PRE>
</BLOCKQUOTE>
It would be nice if sharing aggressiveness could be changed at runtime. without affecting performance for<BR>
at least for the level of aggressiveness of the actual implementation.<BR>
<BR>
In this example<BR>
<BR>
e1=a+b<BR>
e2=a+b+c<BR>
<BR>
I suppose that no sharing is implied like e2=e1+c, that would be computationally expensive. But, if instead<BR>
<BR>
e2=e1+c<BR>
<BR>
then e1 is referenced by e2?? (I suppose yes).<BR>
<BR>
If this is the case I consider that the level of sharing is respectably good :) . I mean, just defining the<BR>
expresions with care would give good results.<BR>
<BR>
The sharing of expressions when diff is really nice.<BR>
<BR>
If parenthesization is kept to a maximum (no expansions made if they are not needed), as I think is the case,<BR>
the sharing structure is kept as long as possible, and that is very good also.<BR>
<BR>
So, in my opinion using, what I understand is, the current level of sharing, I mean no changes at all, would allow<BR>
to print C output code in a very optimized way.<BR>
<BR>
<BR>
<BLOCKQUOTE TYPE=CITE>
<PRE>
> I wonder what do the developers think about this.
Well, I think that if the size of generated code is so prohibitively
large and compiler CSE doesn't help you may be better off writing your
own algorithm collecting similar terms in an associative array. You
could then artificially introduce temporaries, in order to help the
compiler. This would boil down to a more aggressive variant of GiNaC's
subexpression sharing. What do you think?
</PRE>
</BLOCKQUOTE>
<BR>
I suppose you refer to <A HREF="http://en.wikipedia.org/wiki/Associative_array">http://en.wikipedia.org/wiki/Associative_array</A> (I'm not familiar with this).<BR>
What would be the "key" and the "value"??.<BR>
<BR>
I think that you propose to go beyond the level of sharing of GiNaC, trying to find common expresions that are not shared by GiNaC. Do you?<BR>
<BR>
So you start traversing the structure and you push in the Associative array whatever subexpression you find,<BR>
the "key" will be the subexpression and the "value" a new defined symbol for the atom (and may be the number of references to this atom), then you substitute the subexpression in the expression with the new atom.<BR>
If the subexpression was already in the array, then no new insertion is made and the subexpression is substituted for the matching atom (the value).<BR>
<BR>
I suppose that to push a subexpression, you first (recursively) apply recursively the previous recipe to it (so it gets atomized before pushing).<BR>
<BR>
I suppose that a important issue is to decide when a expression should be consider atomized, and I think that this is dependent of the internal structure of GiNaC, for example:<BR>
<BR>
If a - b *( c +d +e) *f is a expression.<BR>
<BR>
atom1= c + d + e<BR>
<BR>
atom2=-b*atom1*f<BR>
<BR>
atom3=a + atom2<BR>
<BR>
I suppose that after this, if the expression<BR>
<BR>
b *( c +d +e) *f<BR>
<BR>
is atomized, a new atom will appear<BR>
<BR>
atom4=b*atom1*f<BR>
<BR>
Instead of avoiding the creation of an atom having into account b *( c +d +e) *f=-atom2<BR>
<BR>
I suppose that this would need things like, inserting the subexpression, only if it or its negative are not in the array.<BR>
<BR>
Nevertheless it seems to me that sharing in GiNaC automatically deals with this (or almost).<BR>
<BR>
Fine grained atomization like,<BR>
<BR>
atom1=c+d<BR>
atom2=atom1+e<BR>
<BR>
seems more difficult and inefficient to implement, due to the internal expression representation.<BR>
<BR>
In my particular implementation, each time a operation is done the result is atomized, and all the expresions are kept atomized.<BR>
To that end I define the special type atom (descent of symbol), that have a pointer to my equivalent (although less optimal) implementation of the associative array.<BR>
<BR>
This special type allows things like implementing making diff print to work flawlessly with atomized expresions (as if they were not). As I dare not overloading all the GiNaC operators, I only overloading the operators of a matrix class in which all the operations that I need are made.<BR>
<BR>
This implementation (certainly improvable), is optimal in some senses (single representation, maximum sharing an minimum memory, low cost of atomization), nevertheless<BR>
fine tunning needs a deep knowledge of the internals of GiNaC.<BR>
<BR>
Thank you very much,<BR>
<BR>
Javier<BR>
<BR>
<BR>
<BR>
<BR>
<BLOCKQUOTE TYPE=CITE>
<PRE>
Bye
-richy.
</PRE>
</BLOCKQUOTE>
</BODY>
</HTML>