[o:XML] SrcML project

Martin Klang martin at hack.org
Mon Dec 20 15:37:00 UTC 2004


On 17 Dec 2004, at 12:57, Frank Raiser wrote:
> Hello,

Hi Frank, thanks for posting to the list.

> As SrcML is more of a framework there are several modules included
> and I suggest you check them all out from CVS.

I found I had to download antlr.jar and dom4j-full.jar to build.
I also got compilation errors in some of the modules, e.g.
parser-shared (after compiling config, api and util successfully).
I'll find the top-level build.xml file and try that.

> From what we've seen in the programming guide for o:XML it appears as
> if o:XML is highly based on XSL with f.ex procedures being used instead
> of the XSL templates. Of course this might be a bit blunt, but from that
> introduction we can not yet see the big difference.

There's an introductory article on XML.com which may interest you ->
http://www.xml.com/pub/a/2004/07/21/oxml.html
Though for the purpose of examining MLML, you can disregard o:XML 
completely.

> SrcML is essentially a framework. It does have an intermediary code
> format specified in our DTD. Though it is not really intermediate, but
> rather an underlying format. The data is kept as XML and worked with it,
> but it is not visible to the user/developer most of the times. It's also
> not intermediate, because we have tools working on SrcML and producing
> SrcML as output again.

Framework for application development, similar to an IDE? Or framework 
for code manipulation?
I believe that the data format, whether you wish to call it 
intermediary or not, is really important (understatement).
It's the basis that all other tools and utilities have to operate on, 
be they visualisers, analysers or processors. That's why it must 
contain the right information at the right level of abstraction.

> I don't really see a difference between the switch statement in Java
> and C. Do you mind explaining this point a little further?

C switch statements allow loop unrolling in rather bizarre ways (as in,
for example, Duff's device). As far as I know this is not possible in Java.
After some googling I found the relevant section of the Java Language
Specification:
http://java.sun.com/docs/books/jls/first_edition/html/14.doc.html#35518
The difference is more syntactic than semantic, but important nonetheless.
Anyhow, it's just an example of language differences, an issue which you've
identified yourselves with SrcML.
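
To make the difference concrete, here is a minimal sketch of roughly how
the idea behind Duff's device ends up looking in Java. The class and
method names are made up for illustration and have nothing to do with
SrcML or MLML. In C the case labels can sit inside the body of a
do-while loop nested in the switch; in Java the case labels have to
appear directly in the switch block, so the fall-through switch and the
unrolled loop become two separate statements.

```java
// Hypothetical illustration; names are not taken from SrcML or MLML.
class UnrolledCopy {

    // Copies src into a new array, handling four elements per loop iteration.
    static int[] copy(int[] src) {
        int[] dst = new int[src.length];
        int i = 0;

        // Handle the remainder first. Fall-through between cases is legal
        // in Java, but the case labels cannot be placed inside a loop body
        // the way they are in the C version of Duff's device.
        switch (src.length % 4) {
            case 3: dst[i] = src[i]; i++; // falls through
            case 2: dst[i] = src[i]; i++; // falls through
            case 1: dst[i] = src[i]; i++;
            default: break;
        }

        // The unrolled loop is a separate statement after the switch.
        while (i < src.length) {
            dst[i] = src[i]; i++;
            dst[i] = src[i]; i++;
            dst[i] = src[i]; i++;
            dst[i] = src[i]; i++;
        }
        return dst;
    }

    public static void main(String[] args) {
        int[] out = copy(new int[] {1, 2, 3, 4, 5, 6, 7});
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

A tool that only knows a generic "switch" would have to treat the C and
Java forms differently, which is the kind of language difference I mean.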

>> As a possible solution that I've recently considered, the specific
>> constructs of any language could be represented in a generic way, eg:
>> <mlml:statement name="java:switch">
...
> The drawback turned out to be too heavy for our goals. One of the main
> goals of the SrcML framework is to allow developers to create tools which
> can work on different languages, without (in the most idealistic case)
> even knowing what language it is working with. This however turns out to
> be impossible when every single construct is only available in a language
> dependent namespace.

The 'infoset' of MLML is:
- Types (classes)
- Variables and variable references (fields)
- Functions and function calls (static and dynamic, OO and procedural)
  (methods)
- Operators and operations (expressions)
- Instructions and statements (as per above: defined language-specific
  statements)

This allows quite extensive analysis, for example for Aspect-Oriented 
or meta-programming, without having to look inside any 
language-specific constructs. For example, all possible branches 
leading to reassignment of a particular type field can be found with an 
XPath expression.
It's not in itself complete, but the goal of MLML is to create a 
'language of languages' rather than a language superset.
Of course the required level of abstraction will be different depending 
on any particular use, so the code format will necessarily be a 
compromise between different needs.
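
As a rough sketch of the kind of XPath query I have in mind, the Java
snippet below lists the functions that assign to a given field. Apart
from mlml:statement with its name attribute, everything here is an
assumption made up for the example: the namespace URI, the element names
mlml:type, mlml:variable, mlml:function, mlml:assign and mlml:reference,
and the target attribute are not the actual MLML vocabulary.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration; the vocabulary below is assumed, not real MLML.
class FieldAssignments {

    // Assumed namespace URI, chosen only for this example.
    static final String MLML_NS = "urn:example:mlml";

    // Assumed sample document: one type with a field and three functions.
    static final String SAMPLE =
          "<mlml:type xmlns:mlml='urn:example:mlml' name='Counter'>"
        + "  <mlml:variable name='count'/>"
        + "  <mlml:function name='reset'>"
        + "    <mlml:assign target='count'/>"
        + "  </mlml:function>"
        + "  <mlml:function name='bump'>"
        + "    <mlml:statement name='java:switch'>"
        + "      <mlml:assign target='count'/>"
        + "    </mlml:statement>"
        + "  </mlml:function>"
        + "  <mlml:function name='peek'>"
        + "    <mlml:reference target='count'/>"
        + "  </mlml:function>"
        + "</mlml:type>";

    // Returns the names of all functions containing an assignment to 'field'.
    static List<String> functionsAssigning(String xml, String field) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Map the 'mlml' prefix used in the expression to the namespace.
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "mlml".equals(prefix) ? MLML_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return MLML_NS.equals(uri) ? "mlml" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return MLML_NS.equals(uri)
                        ? Collections.singletonList("mlml").iterator()
                        : Collections.<String>emptyIterator();
                }
            });
            // Pass the field name as an XPath variable instead of pasting it
            // into the expression string.
            xpath.setXPathVariableResolver(
                name -> "field".equals(name.getLocalPart()) ? field : null);

            // The descendant axis reaches assignments nested in any
            // statement, including language-specific ones such as java:switch.
            NodeList names = (NodeList) xpath.evaluate(
                "//mlml:function[.//mlml:assign[@target=$field]]/@name",
                doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < names.getLength(); i++) {
                result.add(names.item(i).getNodeValue());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("query failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(functionsAssigning(SAMPLE, "count"));
    }
}
```

The query never has to interpret the java:switch statement itself; it
only relies on the generic assignment element nested somewhere below it.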

> If you're interested in this approach you can take a look at the
> languages module of SrcML which provides a means of dynamically gather
> information about a language at runtime

I'll try to put some time aside to spend with SrcML. I'm very 
interested in your experiences and the work you've done.

> We do intend to work with our result documents, but
> when transforming a program from functional to procedural by such means
> we will end up with an unreadable mostly auto-generated version of the
> original code which does not really reflect the intentions of the original
> code.  ...
> This is a prime
> reason why we do not plan on having any such conversion process at all.

Auto-generated or transformed code is not a problem, unless you're 
intending to transform the program back to source format and expect it 
to still be readable and intelligible. Is that a design goal of SrcML?


/m



