DecouplingCyphesis

From WorldForgeWiki

Description

This is my (tomva's) page to track how to decouple cyphesis.

By "decouple", I mean getting cyphesis to the point where it is easy to build and run a cyphesis binary independently of the default WorldForge configuration. The motivation is that world creators will then find it easier to create their own configurations.

For the moment, this is mostly a journal of my attempts. Once I get things working I'll try to go back and put it into a more helpful walkthrough form.


Links

See WorldCreation for some walkthroughs of how to change world configuration using the default WorldForge configuration for cyphesis/cyclient/ember.

See DecouplingUseCases for some use cases I intend to support with my decoupling efforts.

See DecouplingEmber for a similar effort on the client.


Feature/Bugfix Requests

This is the list of feature and/or bugfix requests to which I keep adding as I work on decoupling. I may be able to make many of the changes myself, but of course it would be cool if other WorldForge developers wanted to tackle any of these.

I'm trying to keep these in priority order, with the highest priority on top.

Cyphesis List

  1. WorldForge libraries should always build their static libraries, even if they dynamically link by default. Rebuilding a static cyphesis should be a single rebuild, not a deep rebuild of all dependencies. I actually wonder if static builds shouldn't just be the default.
  2. Need clear error message when startup Python files aren't found. At the moment, you get "SCRIPT_ERROR" messages which aren't very informative.
  3. Atlas should use a real XML parsing engine! Probably a stream-based parser like libexpat. At the moment we have the worst of both worlds: XML's excessive verbosity but no support for basic features like XML comments.
  4. cyphesis shouldn't lock up when it can't find its startup Python files; it should simply exit. (The lockup may be a problem or feature of the Python libraries.) To reproduce, copy cyphesis to some random filesystem directory and run it without any arguments: cyphesis hits 100% CPU and stays there.
  5. The --cyphesis:confdir argument should be used literally with no modifications. That is, cyphesis should not append a "/cyphesis" to the end of the path. In general, binaries shouldn't make assumptions about where files are located. Ideally, the binary should accept the path to a master config file on the command line, and all other config can be reached by starting from the master config file. See the "ideal" directory structure below.

Log

Here is the log of what I'm doing.

Decoupling Config Files

  1. Copy cyphesis on its own to a clean machine. I've got another machine running FC5, and it has never had any WorldForge files installed. I've copied cyphesis into a directory structure of my own making there, to see what the minimal steps are to get it running.
  2. I tried running cyphesis from there, but it complained that it couldn't find its config file. I copied over the base cyphesis config file (cyphesis.vconf from etc/cyphesis) and ran again using the --cyphesis:confdir=DIR argument.
  3. I hit one small snag: the command-line argument isn't authoritative. cyphesis automatically appends a "/cyphesis" sub-directory to whatever you pass in (bug listed above), so I updated my local directory structure to accommodate that.
  4. Next error: cyphesis wanted a tmp directory (fair enough!). Added one to the command line (the --cyphesis:vardir parameter).
  5. At this point cyphesis locked up and I had to kill it from another terminal (Ctrl-C wouldn't work). I think it was because cyphesis couldn't find the necessary python scripts, although the error messages weren't very descriptive.
  6. I wanted to rebuild a cyphesis with more logging, but then I ran into dynamic linking issues. I didn't want to copy over all the shared libraries, so I attempted to rebuild cyphesis statically. That was surprisingly difficult! I had to rebuild all of the dependent libraries :( I've added that as a feature request: libraries should build their static libraries always, since it is cheap to do and saves this sort of rebuild.
  7. Finally (after many rebuilds) I had an instrumented, static cyphesis build! It was clear there were multiple problems:
    1. When initializing python, there were errors looking for "hooks".
    2. When creating the WorldRouter, there were complaints about the "statistics.Statistics" Python module.
    3. Then there was a final message saying "Unable to open rule file", at which point cyphesis pegged the CPU.
  8. I copied over all of the rulesets, and the mason.xml and basic.xml rule files. At this point, cyphesis was running. Now I needed to start stripping content out.
  9. Getting cyclient to run was simple: I just copied over the cyclient binary! It was able to use the existing config for cyphesis.
  10. Furthermore, I was able to run ember on a remote machine and connect to this cyphesis server with no problems.

Here is what the current directory structure looks like (directories only, no files shown here):

  cyphesis
     bin
     etc
        cyphesis
     share
        cyphesis
           rulesets
              basic
              mason
     var
        tmp
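
As a rough sketch, the layout above could be recreated on a clean machine with a few lines of Python. The helper names (make_layout, build_command) are made up for illustration. Passing <root>/etc as confdir relies on the "/cyphesis" suffix being appended as described in the log; passing <root>/var as vardir is only a guess at what cyphesis expects. The command is built but not executed.

```python
import os
import tempfile

# Directory layout copied from the listing above (directories only).
LAYOUT = [
    "bin",
    "etc/cyphesis",
    "share/cyphesis/rulesets/basic",
    "share/cyphesis/rulesets/mason",
    "var/tmp",
]


def make_layout(root):
    """Create the directory skeleton under root and return the created paths."""
    created = []
    for rel in LAYOUT:
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created


def build_command(root):
    """Build (but do not run) a cyphesis command line for this layout.

    Assumption: confdir is <root>/etc because cyphesis appends "/cyphesis"
    itself. Using <root>/var for vardir is a guess, not confirmed behavior.
    """
    return [
        os.path.join(root, "bin", "cyphesis"),
        "--cyphesis:confdir=" + os.path.join(root, "etc"),
        "--cyphesis:vardir=" + os.path.join(root, "var"),
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "cyphesis")
        make_layout(root)
        print(" ".join(build_command(root)))
```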


This is what I wish it looked like:

  cyphesis
     bin
     etc
     rulesets
        basic
        mason
     var

That is, the "cyphesis" subdirectory names are redundant, as is "share". The simpler directory structure isn't possible right now (I think), because cyphesis has built into it some expectations about paths. But this is a pretty good start.

Now: working on the etc directory. I need to transform the XML files into a canonical format that is easier for humans and parsers to work with. The basic.xml and mason.xml files will be compiled from a set of smaller, simpler specification files.


Atlas XML vs. Canonical Format

Actually, transforming the XML to a simpler format may not be worth it! I was able to write a quick reverse compiler, but the idiosyncrasies of the Atlas format will be an issue. Here's an example: an apple tree from Mason.

Here's the raw Atlas version (from mason.xml):

 <map>
   <map name="attributes">
     <map name="bbox">
       <list name="default">
         <float>-0.4</float>
         <float>-0.4</float>
         <float>0</float>
         <float>-0.1</float>
         <float>-0.1</float>
         <float>5</float>
       </list>
       <string name="visibility">public</string>
     </map>
     <map name="burn_speed">
       <float name="default">0.025</float>
       <string name="visibility">public</string>
     </map>
     <map name="fruitChance">
       <int name="default">15</int>
       <string name="visibility">public</string>
     </map>
     <map name="fruitName">
       <string name="default">apple</string>
       <string name="visibility">public</string>
     </map>
     <map name="fruits">
       <int name="default">3</int>
       <string name="visibility">public</string>
     </map>
     <map name="mass">
       <float name="default">3000</float>
       <string name="visibility">public</string>
     </map>
     <map name="maxmass">
       <float name="default">5000</float>
       <string name="visibility">public</string>
     </map>
   </map>
   <string name="id">appletree</string>
   <string name="objtype">class</string>
   <list name="parents">
     <string>tree</string>
   </list>
 </map>

And here's the reverse-compiled version (into a made-up canonical format):

   entity {
       attributes {
           bbox {
               default [
                   value       "float:-0.4"
                   value       "float:-0.4"
                   value       "float:0"
                   value       "float:-0.1"
                   value       "float:-0.1"
                   value       "float:5"
               ]
               visibility  "string:public"
           }
           burn_speed {
               default     "float:0.025"
               visibility  "string:public"
           }
           fruitChance {
               default     "int:15"
               visibility  "string:public"
           }
           fruitName {
               default     "string:apple"
               visibility  "string:public"
           }
           fruits {
               default     "int:3"
               visibility  "string:public"
           }
           mass {
               default     "float:3000"
               visibility  "string:public"
           }
           maxmass {
               default     "float:5000"
               visibility  "string:public"
           }
       }
       id          "string:appletree"
       objtype     "string:class"
       parents [
           value       "string:tree"
       ]
   }

I greatly prefer the made-up canonical version for hand-editing and reading. All of the XML fluff is stripped out, so it is easier to see the underlying structure. However, it remains 1-to-1 with the XML: it is trivial to map back to the XML version.
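
For illustration, here is a minimal sketch of what such a reverse compiler might look like in Python. This is not the actual tool, just one way to get the same shape of output. It assumes the Atlas XML is well-formed enough for the standard xml.etree.ElementTree parser, that the unnamed root map is called "entity", and that unnamed list items get the key "value". The function name to_canonical and the shortened SAMPLE are invented for the example.

```python
import xml.etree.ElementTree as ET

# Atlas container tags and the brackets used for them in the canonical format.
CONTAINERS = {"map": ("{", "}"), "list": ("[", "]")}


def to_canonical(elem, key="entity", depth=0):
    """Return the canonical-format lines for one Atlas XML element."""
    pad = "    " * depth
    name = elem.get("name", key)
    if elem.tag in CONTAINERS:
        open_ch, close_ch = CONTAINERS[elem.tag]
        lines = [f"{pad}{name} {open_ch}"]
        for child in elem:
            # Unnamed children (list items) fall back to the key "value".
            lines.extend(to_canonical(child, "value", depth + 1))
        lines.append(pad + close_ch)
        return lines
    # Atomic element: keep the Atlas type as a prefix so the mapping
    # back to XML stays 1-to-1.
    text = elem.text or ""
    return [f'{pad}{name:<11} "{elem.tag}:{text}"']


# A shortened version of the apple tree, enough to exercise maps, lists
# and atomic values.
SAMPLE = """
<map>
  <map name="attributes">
    <map name="fruits">
      <int name="default">3</int>
      <string name="visibility">public</string>
    </map>
  </map>
  <string name="id">appletree</string>
  <list name="parents">
    <string>tree</string>
  </list>
</map>
"""

if __name__ == "__main__":
    print("\n".join(to_canonical(ET.fromstring(SAMPLE.strip()))))
```

Because the type prefix ("int:", "string:") travels with every value, going back to XML needs no knowledge of what the attributes mean.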

But even in the stripped-down canonical format, the idiosyncrasies are there. For instance, Atlas has both "map" and "list" constructs. Why? I'm guessing these are wire format details, but they mean the file data format (XML or whatever) has to include both. And why are bounding boxes lists and not maps? I think it would be far safer to use a map, so you could label the points as "x0=-5, y0=-5, ..." etc. Instead, the point labels are inferred from ordering. Relying on implicit ordering is dangerous!

I was thinking that a canonical format would insulate the user from this. For instance, in the made-up format, a bounding box could look like this:

  bbox {
     x0  -0.4
     y0  -0.4
     z0  0
     x1  -0.1
     y1  -0.1
     z1  5      
  }

That is closer to what a human expects from a bounding box! But converting back to the Atlas format is fraught with peril. For instance, how does a generic canonical --> XML converter know this should be turned into a list? And how does it know the value types are floats, and not ints or strings? That would require special-case logic like "if this is a bbox element then...", and that's not a maintainable direction.
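
To make the problem concrete, here is a small Python sketch of the kind of special-casing a converter would need. The SPECIAL_CASES table and the friendly_to_atlas function are hypothetical. The point is that the ordering and the float type come from a hand-maintained table, not from the friendly text itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-attribute knowledge the converter would have to carry.
# Nothing in the friendly text says that bbox is a list of six floats.
SPECIAL_CASES = {
    "bbox": {"order": ["x0", "y0", "z0", "x1", "y1", "z1"], "type": "float"},
}


def friendly_to_atlas(name, fields):
    """Convert one friendly attribute (dict of label -> text) to an Atlas list."""
    rule = SPECIAL_CASES.get(name)
    if rule is None:
        # A generic converter has no way to guess the structure or types.
        raise KeyError(f"no conversion rule for {name!r}")
    values = ET.Element("list", name="default")
    for label in rule["order"]:
        ET.SubElement(values, rule["type"]).text = fields[label]
    return values


if __name__ == "__main__":
    bbox = {"x0": "-0.4", "y0": "-0.4", "z0": "0",
            "x1": "-0.1", "y1": "-0.1", "z1": "5"}
    print(ET.tostring(friendly_to_atlas("bbox", bbox), encoding="unicode"))
```

Every new attribute with its own shape would need another entry in the table, which is exactly the maintenance burden described above.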

Conclusion: any sort of world-construction toolset should probably use the Atlas XML format as its native metadata structure. I may still use a canonical format for my own work, and transform back and forth, but anyone building world entities needs to be aware of the Atlas XML details.

If I had it to do over, I'd make the Atlas XML format much simpler. The type information probably isn't needed at the networking or persistence layer: everything should just be strings (and therefore the format need contain no type information at all). And instead of maps and lists, just have multimaps (like XML), and multimaps would be implicit, not explicit. (Multimaps are maps that allow multiple entries with the same key name). And atomic elements could use attributes.

Then the apple tree XML could look something like:

 <entity>
   <attributes>
     <bbox visibility="public">
       <x0>-0.4</x0> <y0>-0.4</y0> <z0>0.0</z0>
       <x1>-0.1</x1> <y1>-0.1</y1> <z1>5</z1>
     </bbox>
     <burn_speed visibility="public">0.025</burn_speed>
     <fruitChance visibility="public">15</fruitChance>
     <fruitName visibility="public">apple</fruitName>
     <fruits visibility="public">3</fruits>
     <mass visibility="public">3000</mass>
     <maxmass visibility="public">5000</maxmass>
   </attributes>
   <id>appletree</id>
   <objtype>class</objtype>
   <parents>
     <parent>tree</parent>
   </parents>
 </entity>

I think that makes better domain-specific use of XML, and is easier to read (IMHO). It's also much more compact! But this probably isn't easy to retrofit.
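
A quick Python sketch of how the simplified form might be read: every element becomes either a plain string or an implicit multimap (name --> list of values), and no type information is kept. The to_multimap name, the "value" key used for text that sits next to attributes, and the second parent ("plant") in the sample are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def to_multimap(elem):
    """Return a string for a plain leaf, else a multimap of name -> list."""
    if len(elem) == 0 and not elem.attrib:
        return (elem.text or "").strip()
    result = defaultdict(list)
    # XML attributes and child elements land in the same multimap.
    for key, value in elem.attrib.items():
        result[key].append(value)
    for child in elem:
        result[child.tag].append(to_multimap(child))
    text = (elem.text or "").strip()
    if text:
        # Hypothetical key for element text that accompanies attributes.
        result["value"].append(text)
    return dict(result)


SAMPLE = """<entity>
  <attributes>
    <fruits visibility="public">3</fruits>
  </attributes>
  <id>appletree</id>
  <parents>
    <parent>tree</parent>
    <parent>plant</parent>
  </parents>
</entity>"""

if __name__ == "__main__":
    entity = to_multimap(ET.fromstring(SAMPLE))
    print(entity["id"])
    print(entity["parents"][0]["parent"])
```

Repeated keys such as parent simply accumulate in a list, so there is no need for a separate list construct in the file format.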


Primitive Compiler

To start, I'm going to work on a simple compiler which, given a starting directory structure containing all sorts of atomic elements like models, behaviors, terrain, etc., compiles just the server config files.

How is that starting directory structure actually populated? In the future, I imagine there will be packages people can share that contain these elements (models, scripts, etc.), and some sort of package importer will add the package data to the directory structure.

For now, I'll just focus on starting with an already-populated directory structure and generating server-only output.


Compiler Notes

The World Compiler is getting more and more functional. At this point, it is able to build the ruleset xml files, the cyphesis vconf file, and the define_world.py file (just the terrain). The ruleset xml file is the most interesting, since it requires parsing all of the classes that make up the ruleset, and each class can reference behavior (python scripts). Most of that is working.

Running the server with compiled data is NOT working yet, primarily because I still don't know enough about the cyphesis startup process. At the moment I'm having problems with ruleset hooks. Hopefully, there is a simple way to get the hooks to load at startup. If it requires hard-coded knowledge about locations (for instance, the basic ruleset has to be there) then I'll have to deal with that in the World Compiler.

My first milestone is the ability to compile a working server (all config and scripts). Once that works, getting the client package should be fairly straightforward.

A note on entity description formats: in the end, I went back to my own format. An entity (like the Wolf or Settler class) is a composite object, with some attributes of its own plus links to more primitive objects such as behaviors and 3D models. So the world compiler needs to construct XML on the fly anyway. In that case, there isn't a need to keep the exact same syntax, so I went for something much simpler.

For instance, this is the code that declares the wolf:

 id              wolf
 parent          character
 bounding-box {
       x0      -0.6
       y0      -0.26
       z0      0
       x1      1.3
       y1      0.26
       z1      1
 }
 mass            20
 maxmass         30
 mind            mind.animals.WolfMind
 modeldef        wolf

Compare that to the XML:

 <map>
   <map name="attributes">
     <map name="bbox">
       <list name="default">
         <float>-0.6</float>
         <float>-0.26</float>
         <float>0</float>
         <float>1.3</float>
         <float>0.26</float>
         <float>1</float>
       </list>
       <string name="visibility">public</string>
     </map>
     <map name="mass">
       <float name="default">20</float>
       <string name="visibility">public</string>
     </map>
     <map name="maxmass">
       <float name="default">30</float>
       <string name="visibility">private</string>
     </map>
   </map>
   <string name="id">wolf</string>
   <map name="mind">
     <string name="language">python</string>
     <string name="name">mind.WolfMind</string>
   </map>
   <string name="objtype">class</string>
   <list name="parents">
     <string>character</string>
   </list>
 </map>
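
Here is a hedged Python sketch of how a compiler step might turn the simple declaration into Atlas-style XML. It is not the actual World Compiler code. The function names are invented, every attribute is marked "public" (the real output differs, for example maxmass above is private), the mind language is assumed to be python, and modeldef is ignored because it concerns the client.

```python
import xml.etree.ElementTree as ET

BBOX_ORDER = ["x0", "y0", "z0", "x1", "y1", "z1"]


def parse_class(text):
    """Parse 'key value' rows and 'key { ... }' blocks into a nested dict."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "}":
            stack.pop()
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        if value == "{":
            block = {}
            stack[-1][key] = block
            stack.append(block)
        else:
            stack[-1][key] = value
    return root


def add_attribute(attrs, name):
    """Add an attribute map with an assumed 'public' visibility."""
    attr = ET.SubElement(attrs, "map", name=name)
    return attr


def to_atlas(cls):
    """Build an Atlas-style element tree from a parsed class declaration."""
    entity = ET.Element("map")
    attrs = ET.SubElement(entity, "map", name="attributes")
    if "bounding-box" in cls:
        bbox = add_attribute(attrs, "bbox")
        values = ET.SubElement(bbox, "list", name="default")
        for label in BBOX_ORDER:
            ET.SubElement(values, "float").text = cls["bounding-box"][label]
        ET.SubElement(bbox, "string", name="visibility").text = "public"
    for name in ("mass", "maxmass"):
        if name in cls:
            attr = add_attribute(attrs, name)
            ET.SubElement(attr, "float", name="default").text = cls[name]
            ET.SubElement(attr, "string", name="visibility").text = "public"
    ET.SubElement(entity, "string", name="id").text = cls["id"]
    if "mind" in cls:
        mind = ET.SubElement(entity, "map", name="mind")
        ET.SubElement(mind, "string", name="language").text = "python"
        ET.SubElement(mind, "string", name="name").text = cls["mind"]
    ET.SubElement(entity, "string", name="objtype").text = "class"
    parents = ET.SubElement(entity, "list", name="parents")
    ET.SubElement(parents, "string").text = cls["parent"]
    return entity


WOLF = """
id              wolf
parent          character
bounding-box {
      x0      -0.6
      y0      -0.26
      z0      0
      x1      1.3
      y1      0.26
      z1      1
}
mass            20
maxmass         30
mind            mind.animals.WolfMind
modeldef        wolf
"""

if __name__ == "__main__":
    print(ET.tostring(to_atlas(parse_class(WOLF)), encoding="unicode"))
```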


Success!

Finally today (18 Sep), after several days of poking around, I got my first primitive world to work. This is a cyphesis installation completely compiled from scratch!

A note on "compiling". I do not mean that I compiled the binary executables (cyphesis and cyclient). I mean that I was able to start with some primitive, low-level descriptions of objects and behaviors, and produce a directory structure all set up so that cyphesis could run and support those objects. And the output includes pre-compiled cyphesis and cyclient binaries. The output (compiled) directory structure is packageable (for instance, tar and zip) so that anyone could download and run it locally.

Here is the minimum set of files I found necessary to run (that is, this is the compiled output):

cyphesis/
|-- bin
|   |-- cyclient
|   `-- cyphesis
|-- etc
|   `-- cyphesis
|       |-- antares.xml
|       `-- cyphesis.vconf
|-- share
|   `-- cyphesis
|       `-- rulesets
|           `-- antares
|               |-- __init__.py
|               |-- define_world.py
|               |-- editor.py
|               |-- hooks
|               |   |-- __init__.py
|               |   `-- ruleset_import_hooks.py
|               |-- mind
|               |   |-- __init__.py
|               |   `-- panlingua
|               |       |-- __init__.py
|               |       |-- interlinguish.py
|               |       `-- panlingua.py
|               `-- world
|                   |-- __init__.py
|                   `-- statistics
|                       |-- Statistics.py
|                       `-- __init__.py
`-- var

And here was the content tree (source directory structure) from which that was generated:

content/
|-- behaviors
|   `-- core
|       |-- Statistics.behavior
|       |-- Statistics.py
|       |-- editor.behavior
|       |-- editor.py
|       |-- interlinguish.behavior
|       |-- interlinguish.py
|       |-- panlingua.behavior
|       |-- panlingua.py
|       |-- ruleset_import_hooks.behavior
|       `-- ruleset_import_hooks.py
|-- bin
|   |-- cyclient
|   `-- cyphesis
|-- classes
|   |-- mobile.class
|   `-- settler.class
|-- master.txt
`-- terrain.dat

I chose to call the ruleset "antares" for no good reason other than that I wanted a name that was obviously different from "mason" or "basic" (so I could smoke out any dependency issues).

The "core" behaviors are those scripts which are necessary to run. The Statistics.py file must be there (hard-coded dependency in cyphesis). The ruleset_import_hooks.py file must also be there so cyphesis can import modules at all. The editor, interlinguish, and panlingua files are required only because my autogenerated define_world.py file needs them.

The .behavior files describe how each local script is mapped into the ruleset. For instance, the panlingua.behavior file says that the ID of this script should be mind.panlingua.panlingua, its language is python, and its local path is core/panlingua.py. So .behavior files should also work for other logic plugin formats that may be supported in the future, such as .so's.
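
The ID --> path mapping can be sketched in a few lines of Python. This is an illustration inferred from the output tree shown earlier, not the compiler's actual code. The function names are invented, and the sketch assumes each dot in the ID becomes one directory level that needs its own __init__.py.

```python
import posixpath


def ruleset_root(ruleset):
    """Output directory of a ruleset, following the compiled tree above."""
    return posixpath.join("share", "cyphesis", "rulesets", ruleset)


def behavior_target(ruleset, behavior_id, extension=".py"):
    """Map a dotted behavior ID to its path in the output tree."""
    parts = behavior_id.split(".")
    return posixpath.join(ruleset_root(ruleset), *parts) + extension


def package_inits(ruleset, behavior_id):
    """List the __init__.py files needed to make the module importable."""
    packages = behavior_id.split(".")[:-1]
    base = ruleset_root(ruleset)
    inits = [posixpath.join(base, "__init__.py")]
    for depth in range(1, len(packages) + 1):
        inits.append(posixpath.join(base, *packages[:depth], "__init__.py"))
    return inits


if __name__ == "__main__":
    print(behavior_target("antares", "mind.panlingua.panlingua"))
    for path in package_inits("antares", "mind.panlingua.panlingua"):
        print(path)
```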

The terrain.dat file is just a list of i,j,z values.
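
A minimal Python sketch of reading such a file follows. It assumes i and j are integer grid indices and z is a floating-point height, accepts either commas or whitespace between the values, and skips blank lines. Support for "#" comments is also an assumption. The parse_terrain name is made up.

```python
import re


def parse_terrain(text):
    """Parse rows of i, j, z values into a dict of (i, j) -> z."""
    points = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        # Assumption: anything after "#" is a comment.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = re.split(r"[,\s]+", line)
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 values, got {len(fields)}")
        i, j, z = int(fields[0]), int(fields[1]), float(fields[2])
        points[(i, j)] = z
    return points


if __name__ == "__main__":
    sample = "0 0 12.5\n1,0,8\n\n0 1 -3\n"
    print(parse_terrain(sample))
```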

Here is the mobile.class file in its entirety:

 id              mobile
 parent          character

And the settler.class file:

 id              settler
 parent          mobile
 playable        1

In fact, I could have just had settler inherit from character directly, and avoided the mobile class. I used it because that's how mason.xml was set up, but other playable classes do not inherit from mobile.

The master.txt file is the starting point of world compilation: it provides the high-level directions. Here is the current version:

#
# master.txt
#
# This is a sample file to demonstrate what a WorldForge world's
# master config file looks like.  This tells the worldCompiler
# how to compile usable server + client packages.
#
# Format is pretty simple: the first token on each row is
# the key, and everything else is the value (leading whitespace
# is removed).
#
# NOTE: pretty much any of the values specified in this file
# can be overridden on the command line.  For instance, you
# could override the terrain file name by passing
# --terrain-file=<different-path>
# on the command line to worldCompiler.
#
# There is one restriction: all paths must be relative to the
# content-directory root path.  That applies to command-line 
# specified paths as well.


# What is the world name?
world-name              Antares


# what is the ruleset name?
ruleset-name            antares


# specify the terrain file to use.  The terrain file is a text
# file that contains rows in "i j z" format (see Mercator documentation)
# path is assumed to be relative to root of content directory
terrain-file            terrain.dat


# where are the binaries to be packaged?
cyphesis-bin            bin/cyphesis
cyclient-bin            bin/cyclient

A lot of comments! There are only 5 active lines:

  • set the name of the world ("Antares")
  • set the name of the ruleset ("antares")
  • point to the terrain file containing topography ("terrain.dat")
  • point to the binary to use for cyphesis
  • point to the binary to use for cyclient
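
The format is simple enough that a parser fits in a few lines. Here is a Python sketch following the rules stated in the file's comments (first token is the key, the rest is the value, "#" starts a comment line, command-line values override file values). The function names and the alt/terrain.dat override path are invented for the example.

```python
def parse_master(text, overrides=None):
    """Parse master.txt rows into a dict, then apply any overrides."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # First token is the key; everything after it is the value.
        parts = line.split(None, 1)
        config[parts[0]] = parts[1] if len(parts) > 1 else ""
    config.update(overrides or {})
    return config


def overrides_from_argv(argv):
    """Collect --key=value arguments as overrides for master.txt keys."""
    overrides = {}
    for arg in argv:
        if arg.startswith("--") and "=" in arg:
            key, value = arg[2:].split("=", 1)
            overrides[key] = value
    return overrides


MASTER = """
# What is the world name?
world-name              Antares
ruleset-name            antares
terrain-file            terrain.dat
cyphesis-bin            bin/cyphesis
cyclient-bin            bin/cyclient
"""

if __name__ == "__main__":
    # Hypothetical override, in the style of --terrain-file=<different-path>.
    args = ["--terrain-file=alt/terrain.dat"]
    print(parse_master(MASTER, overrides_from_argv(args)))
```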

All other information is determined by recursively walking the content directory. For instance, the antares.xml file is composed by finding all .class files and parsing them to construct classes (rules). And all .behavior files are parsed, and corresponding behaviors (python scripts) are copied over to their proper locations in the output directory structure (following the ID --> path rules).
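
The recursive walk might look something like this in Python. It is only a sketch: find_sources is a made-up name, and the real compiler goes on to parse and copy the files it finds, which is not shown here.

```python
import os


def find_sources(content_root):
    """Collect .class and .behavior files under the content directory.

    Returns paths relative to content_root, using "/" as the separator.
    """
    found = {".class": [], ".behavior": []}
    for dirpath, dirnames, filenames in os.walk(content_root):
        dirnames.sort()  # visit subdirectories in a stable order
        for filename in sorted(filenames):
            extension = os.path.splitext(filename)[1]
            if extension in found:
                full = os.path.join(dirpath, filename)
                relative = os.path.relpath(full, content_root)
                found[extension].append(relative.replace(os.sep, "/"))
    return found


if __name__ == "__main__":
    # Prints whatever .class and .behavior files exist under ./content.
    print(find_sources("content"))
```

Keeping the paths relative to the content root matches the restriction in master.txt that all paths are relative to that root.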

I'll add a bit more to the server (for instance, I want to make sure I include some classes with AI behaviors, that is, mind scripts).

Then I'll work on generating the client side of things, and in particular getting models, modeldefinitions, and modelmappings correct.

Once I have client and server packages being successfully compiled (and running!) from a single content source tree it will be in a publishable alpha state.

I have started on user documentation at WorldCompiler, although this isn't released yet.