[Opensource] Future Code for Expresso 5.1 and beyond [LONG THINK TANK ARTICLE]

Michael Rimov rimovm at centercomp.com
Sat Nov 2 20:20:27 PST 2002


Hey All,

I've been giving some serious thought to the next release (or two) of 
Expresso.  Some of this will appear in the next release, but by and large 
the changes will offer backwards-compatible paths for your current programs 
[much in the same way that 4.02 -> 5.0 has some issues, but nothing 
serious].  We will @deprecate certain ways of doing things for eventual 
removal, but the goal here is to take things one step at a time to ease the 
transition.

Furthermore, I haven't fully decided which version each of these features 
will show up in.  Our goal is to have a 5.1 release with a couple of new 
major features, so we can keep people up to date at a decent pace without 
stalling for a long period of time between releases.  Our current road map 
goal (if I remember correctly :) ) is to work some on ConfigManager 
refactoring, but it may be that we need to do some basic reorganization 
that we can do in steps between each release.

Ok, on to the future designs:

I have not met anybody on the list that was not for better componentization 
of Expresso elements.  While the Schema is a nice 'high level' component, 
it leaves much to be desired when it comes to swapping out, for example, 
JNDI connection pools for DBConnectionPool, or a JCache implementation for 
CacheManager.  Right now Expresso is flat-out not geared to be able to do 
this.  There is a lot of hard-coded stuff that is causing tight coupling. 
[If I got the cohesion vs coupling terminology right <grin>]

The other issue I have is the number of jar files required to run even a 
command line program.  I don't like the idea of including 15+ jar files in 
the CLASSPATH just to run a command line program that will connect to a 
database.

And finally, a common complaint is that Expresso by and large is not 
reloadable, in the sense that you have to stop and start the Servlet engine 
every time you change something.  There isn't any significant runtime 
reconfiguration available.

SO!  Let's discuss where to go:

I've been looking at how Apache Avalon and some other frameworks do 
things, and have been coming up with my own ideas:

1 - Most component metadata should be removed from code and stored in 
configurations [whether database, XML files, property files, 
etc].  Examples of such metadata are:
		-dbobject.setupFields()
		-Configuration settings.
		-Controller state definitions.
		-Schema definitions.

Advantages are:

-Faster development: By changing definition files, you can reload the 
webapp without recompiling and restarting the servlet container.
-Better self-documentation.  By using stylesheets, you can easily turn the 
metadata into a self-generated documentation set.
-Potential management capabilities.  Anything that is moved to properties 
files could potentially be managed at runtime through interfaces such as 
JMX.
-Easier "code generation"/tools support: instead of having 'wizards' 
generate entire java files, they could more easily write out an xml 
file.  People are more likely to write XML file parsers, as evidenced by 
such things as the Struts console application.
-Easier introspection.  If we write, for example, a built-in component 
installer, the xml metadata could be parsed to show how to configure the 
system.  Otherwise, you have to reload a system, load the class files and 
introspect the classes, which is more difficult IMO.

Potential Disadvantages:
-Runtime configuration is a potential security issue.  We have to watch 
what kind of configuration options we expose, to prevent attackers from, 
for example, making a socket connection and changing everything so they 
have free access.  This is an exaggerated example, of course, but it's 
still a possibility that we should keep an eye on.

-Slower startup times.  Parsing XML is more CPU intensive than loading a 
pre-compiled class.  How much of a difference this will make, I'm not 
sure.  But it's certainly something to keep in mind as we optimize.

-Anything else that somebody can think of?




2 - Object instances of components and services should be retrieved through 
a "ServiceLocator."  The concept will be similar to DNS, or looking up an 
object stored in JNDI.  In fact, the ideal concept is that the appropriate 
ServiceLocator could query a JNDI server for object instances.  The default 
implementation will not do so since Expresso does not currently require a 
JNDI server.

3 - Each named service that we create will have a specific Interface rather 
than a class that we issue calls to.  This allows us to clearly define the 
system services and allows people to substitute their own implementation 
and only write a wrapper class that translates calls to the Interface to 
the underlying implementation.

4 - I'd like to see the concept of SCOPE applied.  For example, some 
services will be available in a GLOBAL scope.  DBConnectionPool would be 
a good example of this.  However, it would significantly help reduce 
"service name" collisions if you could keep specific components private 
to your own applications.  It would also IMO help from the security 
standpoint.
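
To make points 2 through 4 a bit more concrete, here is a minimal sketch of 
how a scoped locator could look.  None of these names exist in Expresso 
today: ServiceLocator, ConnectionSource, the "global" scope key and the 
fall-back-to-global lookup rule are all assumptions made up for 
illustration, and the sketch ignores thread safety and JNDI entirely.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical service interface (point 3): callers depend only on this. */
interface ConnectionSource {
    String describe();
}

/** Hypothetical locator (points 2 and 4): name lookup with per-component scopes. */
final class ServiceLocator {
    static final String GLOBAL = "global";

    // scope name -> (service name -> instance); not thread safe in this sketch
    private final Map<String, Map<String, Object>> scopes = new HashMap<>();

    /** Binds an instance to a name inside one scope. */
    void bind(String scope, String name, Object service) {
        scopes.computeIfAbsent(scope, s -> new HashMap<>()).put(name, service);
    }

    /** Checks the caller's own scope first, then falls back to GLOBAL. */
    <T> T lookup(String scope, String name, Class<T> type) {
        Object found = find(scope, name);
        if (found == null) {
            found = find(GLOBAL, name);
        }
        if (found == null) {
            throw new IllegalArgumentException(
                    "No service '" + name + "' visible from scope '" + scope + "'");
        }
        return type.cast(found);
    }

    private Object find(String scope, String name) {
        Map<String, Object> services = scopes.get(scope);
        return services == null ? null : services.get(name);
    }
}

class ServiceLocatorDemo {
    public static void main(String[] args) {
        ServiceLocator locator = new ServiceLocator();
        // A lambda stands in for a real pool, or for a wrapper around one.
        ConnectionSource pool = () -> "default pool";
        locator.bind(ServiceLocator.GLOBAL, "connectionSource", pool);
        locator.bind("eforum", "mailer", "forum-only mailer");

        ConnectionSource seen =
                locator.lookup("eforum", "connectionSource", ConnectionSource.class);
        System.out.println(seen.describe());
    }
}
```

With this arrangement a lookup of "mailer" from any scope other than 
"eforum" fails, which is the collision and security benefit described in 
point 4.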

Ok, so that's what I see from a 'top level' view of what I'm looking 
at.  Now let's get to some specifics.  I'm borrowing the metadata model 
partly from the Struts 1.1 form definition, where you can define a form, 
its type, and each attribute's name without having to write a new bean class.

----------------------------------------------------------------------------
XMLSchema:

Here is a rough draft of a new metadata file for a schema.  This would be 
pretty easy to add: in SchemaList, instead of entering the classname of the 
Schema, we simply enter the path to the xml file, which would be loaded 
with loadResource().  We then decide whether the entry is an xml file or a 
class name, and either load the class or load XMLSchema and give it the 
filename as a parameter.  The listing below is more of a pseudocode for 
what I have in mind than a full blown xml file.

<component name="">
	<title></title>
	<version-info>
		<major-version>4</major-version>
		<minor-version>0</minor-version>
		<micro-version>0</micro-version>
	</version-info>
	<message-bundles>
		<message-bundle>/com/jcorporate/expresso/core/Messages.properties</message-bundle>
	</message-bundles>
	<components>
		<dbobjects></dbobjects>
		<controllers></controllers>
		<jobs></jobs>
		<reports></reports>
	</components>
	<services></services>
	<component-configuration>
		<scope>component|service|global</scope>
		<property name="" description="" type="" default=""/>
		<array-property name="" type="" description="">
			<array-value></array-value>
		</array-property>
	</component-configuration>
	<requires>
		<required-schema>
		</required-schema>
	</requires>
</component>

As you can see, the concept of the current Schema is ported straight 
across, although it's named a component now.  Some potential initial or 
future expansions in this include the ability to support arrays of 
properties.  The other thing I added was the possibility for a Schema to 
have multiple Message Bundles.  The reason I'm looking at this is that 
currently, if you derive from a controller, you must find all the messages 
associated with it, and sometimes that can be a huge task.  Subdividing the 
message bundles could make that task cleaner.

We could go so far as to define a Controller's Metadata, Job's Metadata, and 
each DBObject's Metadata in the component definition.  I'm half leaning 
towards this since I'd like a central location to control each schema 
without having all these xml files scattered everywhere.
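
As a rough illustration of how little code the loading side might need, 
here is a sketch using the standard JAXP DOM parser to pull the version out 
of a component file shaped like the draft in this section.  
ComponentMetadataReader and both of its methods are hypothetical names, and 
the "ends with .xml" test for telling a file path from a class name is just 
one possible rule, not a decision.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical reader for the draft <component> metadata format. */
final class ComponentMetadataReader {

    /** Assumed rule: a SchemaList entry ending in ".xml" is a metadata file. */
    static boolean isXmlResource(String schemaListEntry) {
        return schemaListEntry.toLowerCase(Locale.ROOT).endsWith(".xml");
    }

    /** Returns "major.minor.micro" taken from the <version-info> element. */
    static String readVersion(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            return text(root, "major-version") + "."
                    + text(root, "minor-version") + "."
                    + text(root, "micro-version");
        } catch (Exception e) {
            // Malformed XML or a missing element ends up here in this sketch.
            throw new IllegalArgumentException("Unreadable component metadata", e);
        }
    }

    // First descendant element with the given tag name, as trimmed text.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }
}

class ComponentMetadataDemo {
    public static void main(String[] args) {
        String xml = "<component name=\"demo\"><version-info>"
                + "<major-version>4</major-version>"
                + "<minor-version>0</minor-version>"
                + "<micro-version>0</micro-version>"
                + "</version-info></component>";
        System.out.println(ComponentMetadataReader.readVersion(xml));
    }
}
```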
-------------------------------------------------------------------------
Future of DBObjects:

I've already been working on a DataObject API since Expresso 5, and the 
roots of it are in place.  One goal I have is a series of async DataObject 
APIs.  However, that is a long term goal for Expresso 6, and I'm only "baby 
stepping" in that direction so we don't lose anybody.  I'm at the point 
where we, as a community, need to make a decision on how we want field 
access.  There are a few options:

Pseudocode:

	DBObject.getField().asString()/.asDate()/.asInteger()  etc.

		or
	(Date)DBObject.getField()/(String)DBObject.getField() etc.
		or

	DBObject.getFieldString()/.getFieldInt()/.getFieldDecimal()

Maybe it's preference, but I don't really like the 3rd method, even though 
that's what we're doing right now.  The biggest problem is that it is part 
of the reason we have this monster of a DBObject class right now.  By 
delegating conversion code to the field object (as in #1) we slim down 
DBObject.   #2 is easier to code in that you make DBObject a dynabean and 
leave it up to the coder to know what is what.... but I think it might lead 
to weird class cast exceptions.  So I'm currently leaning towards #1 as my 
own personal preference.
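
For anybody who wants to see what option #1 might feel like in practice, 
here is a small sketch.  FieldValue and SimpleDataObject are made-up 
stand-ins, not the real DataObject API; storing values as strings and using 
java.time.LocalDate for dates are assumptions chosen to keep the example 
short.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical field holder that owns its own type conversions (option #1). */
final class FieldValue {
    private final String raw;

    FieldValue(String raw) {
        this.raw = raw;
    }

    String asString() {
        return raw;
    }

    int asInteger() {
        return Integer.parseInt(raw.trim());
    }

    /** Expects ISO-8601 text such as 2002-11-02. */
    LocalDate asDate() {
        return LocalDate.parse(raw.trim());
    }
}

/** Hypothetical slimmed-down data object: no per-type getters at all. */
final class SimpleDataObject {
    private final Map<String, String> values = new HashMap<>();

    void setField(String name, String value) {
        values.put(name, value);
    }

    FieldValue getField(String name) {
        if (!values.containsKey(name)) {
            throw new IllegalArgumentException("Unknown field: " + name);
        }
        return new FieldValue(values.get(name));
    }
}

class FieldAccessDemo {
    public static void main(String[] args) {
        SimpleDataObject user = new SimpleDataObject();
        user.setField("loginCount", "42");
        user.setField("joined", "2002-11-02");
        System.out.println(user.getField("loginCount").asInteger());
        System.out.println(user.getField("joined").asDate().getYear());
    }
}
```

The point of the shape is that adding a new conversion touches only the 
field class, so the data object itself stays small.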

Now keep in mind that this API probably won't be finalized for a year, so 
as far as rewriting code goes, you won't have to do it immediately.  What 
are your thoughts on this?  Which API method would 
you prefer to see?  If you want it as is
-----------------------------------------------------------------------
Job Handler:

Ok, so we've all pretty much concluded that JobHandler needs some 
work.  I'd honestly like to see refactoring take a couple of steps:

STEP 1:  Make some interfaces to Job Handler that allow code to directly 
control the Job Queue: initiate shutdown, queue a job, etc.  Ideally, we 
wouldn't use the JobQueue DBObjects at all from the 'initiate a job' 
standpoint.

STEP 2: Get a good reliable Job Handler that runs in a separate 
thread.  Having Job Handler run in a separate VM currently can cause all 
sorts of issues with caching.  So let's not even try it first off.  Let's 
boost reliability in-VM.  Issues are:
		Clean Shutdown
		Track down the final JobHandler/DBObject race conditions.
		Reliably run Job Scheduling.
		

STEP 3: Work on an out of VM Job Handler.  AFAICT, this will require 
something like RMI to provide decent IPC with the Job Handler.  [Or JMS 
if we want multiple Job Servers].  This would DEFINITELY 
be a replacement component for the in-VM Job Handler since it would require 
additional servers such as RMI Registry or JMS server.  But we can work on 
it without affecting the current code base because we've componentized 
JobHandler as listed in step 1.
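
Here is a minimal sketch of what steps 1 and 2 could boil down to, assuming 
a made-up JobHandler interface with just two operations.  The in-VM 
implementation leans on java.util.concurrent for its single worker thread, 
which is my choice for the sketch and not how the current JobHandler works; 
job scheduling is left out completely.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical step 1 interface: direct control of the job queue. */
interface JobHandler {
    void queueJob(Runnable job);

    /** Stops accepting jobs and waits for queued ones; true if all finished. */
    boolean shutdown(long timeoutMillis);
}

/** Hypothetical step 2 implementation: one worker thread inside the same VM. */
final class InVmJobHandler implements JobHandler {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    @Override
    public void queueJob(Runnable job) {
        worker.execute(job);
    }

    @Override
    public boolean shutdown(long timeoutMillis) {
        worker.shutdown(); // already-queued jobs still run
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            worker.shutdownNow();
            return false;
        }
    }
}

class JobHandlerDemo {
    public static void main(String[] args) {
        JobHandler handler = new InVmJobHandler();
        handler.queueJob(() -> System.out.println("running job"));
        boolean clean = handler.shutdown(5000);
        System.out.println("clean shutdown: " + clean);
    }
}
```

Callers only ever see JobHandler, so an RMI or JMS backed implementation 
from step 3 could be swapped in behind the same two methods.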

Any thoughts on this??
---------------------------------------------------------------------
	
DBMaint - We've got a rework of the DBMaint UI slated for a couple of .1 
releases in the future.  I suggest that we consider making a series of 
mockups and having the community hash out what we like best and why.

I'm working on completing the DataObject Interfaces in such a way that 
DBMaint can work on any object complying with the DataObject 
interface.  In particular, this would allow DBMaint to work with 
MultiDBObjects AS WELL AS DBObjects.  I really consider this a high 
priority, especially as we better normalize our tables in such things as 
User management/Registration, etc.

----------------------------------------------------------------------------
Security

There are a few things that I would like to see happen in this area:

1 - Componentize a security manager.  It will allow people to better plug 
in their own security systems such as LDAP back-ends etc.

2 - Better integrate with Java Security API.  For example, the Current User 
login, should have a java.security.Principal representing the current 
user.  This will give us the ability to eventually fully integrate with JAAS.

3 - By componentizing, I also want to see a "plug-in" capability, so 
designers can plug in custom security checks, such as custom eForum 
security.  But this way we have a unified API.

4 - Expresso currently has the pattern of a base class that provides the 
basic API, and a subclass that handles the security.  Examples 
are:  Controller/DBController and DBObject/SecuredDBObject.  I would 
frankly like to merge these in so that there is only, for example, a 
Controller object.  Programmers can, however, set the security manager 
component for each of the objects, so if a programmer wants the equivalent 
of a DBObject, rather than a SecuredDBObject, they'd call something like:
		dbObject.setSecurityManager(new NullSecurityManager());

And the null security manager would specifically allow everything through 
without security checks.
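
To show how points 2 through 4 could fit together, here is a small sketch.  
ObjectSecurityManager, UserPrincipal, SingleWriterSecurityManager and 
SecurableObject are all hypothetical names; only java.security.Principal is 
a real API.  The interface is deliberately not called SecurityManager, to 
avoid confusion with java.lang.SecurityManager.

```java
import java.security.Principal;

/** Hypothetical pluggable check consulted before an object is used. */
interface ObjectSecurityManager {
    boolean isAllowed(Principal user, String operation);
}

/** Allows everything: the equivalent of today's unsecured DBObject. */
final class NullSecurityManager implements ObjectSecurityManager {
    @Override
    public boolean isAllowed(Principal user, String operation) {
        return true;
    }
}

/** Example custom plug-in: anybody may read, one named user may update. */
final class SingleWriterSecurityManager implements ObjectSecurityManager {
    private final String writer;

    SingleWriterSecurityManager(String writer) {
        this.writer = writer;
    }

    @Override
    public boolean isAllowed(Principal user, String operation) {
        if (!"update".equals(operation)) {
            return true;
        }
        return user != null && writer.equals(user.getName());
    }
}

/** Point 2: the logged-in user represented as a java.security.Principal. */
final class UserPrincipal implements Principal {
    private final String login;

    UserPrincipal(String login) {
        this.login = login;
    }

    @Override
    public String getName() {
        return login;
    }
}

/** Point 4: a single class whose security behaviour is set, not subclassed. */
final class SecurableObject {
    private ObjectSecurityManager securityManager;

    SecurableObject(ObjectSecurityManager securityManager) {
        this.securityManager = securityManager;
    }

    void setSecurityManager(ObjectSecurityManager securityManager) {
        this.securityManager = securityManager;
    }

    void update(Principal user) {
        if (!securityManager.isAllowed(user, "update")) {
            throw new SecurityException("update denied");
        }
        // the actual update would happen here
    }
}

class SecurityDemo {
    public static void main(String[] args) {
        SecurableObject record = new SecurableObject(new SingleWriterSecurityManager("mike"));
        record.update(new UserPrincipal("mike"));     // passes the check
        record.setSecurityManager(new NullSecurityManager());
        record.update(new UserPrincipal("anybody"));  // no checks at all now
        System.out.println("both updates allowed");
    }
}
```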

-----------------------------------------------------------------------

Ok, well that's all that's coming from this brain dump.  If anybody wants 
to comment on the directions, I would love to hear from them.  As I said at 
the beginning of the email, not all of these changes are going into 
5.1!  I'm going to be focusing on Configuration and Component Management in 
5.1, and then we'll continue the work in 5.2, etc.  You will also see 
continued work on the Data Object API, so I would REALLY like to hear how 
you want to retrieve field values so I can determine the basic usage 
pattern in the API before continuing.

Thanks for reading! I look forward to your comments!
						-Mike




