See discussion on StorageRefactoring

Current state

Most of the checked stuff is not committed yet and still lives in TW's workdir. After my commit, do not expect the stuff to be useful for a production wiki; use it for playground wikis only.

A check mark means "more or less done". I am quite sure that it will take some time of fixing/tweaking/tuning until it is completely done.

implement FileSystemStorage as ../PagesAsBundles
storage API: ItemRevision, ItemBundle, Storage (== ItemBundle), LayeredStorage (with named layers, can be more than 2 - not sure if we ever use more than 2, though)
- caching module's stuff was moved to ItemBundle (mostly "as is").
  - we maybe need some more "long term" cache. I am not sure if we want a really persistent thing (meaning you MUST NOT delete it - this is the kind of stuff that maybe should be in an Item, not in a cache-like dir). So we need a definition first.
- Meta data is represented as a dict of strings, directly representing the "key: value" lines in the meta file. Additionally, there are some convenience attributes on ItemRevision and ItemBundle level for accessing the most used dict members.
- Meta data is kept in memory as long as the object is alive. Content data is also kept in memory, but only for the current revision and only if its size is < 4000 (thinking of "data+overhead fits in one 4KB page").
- some simple (maybe too simple) locking functions for ItemBundle
- saving a new revision: lock itembundle (returns current meta), update bundle meta, update meta and data attributes of revision, call sync() of revision object, unlock itembundle (also setting new bundle meta)
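The save sequence described above could look roughly like the following sketch. It is an in-memory illustration only, not the real code: lock(), unlock() and sync() are named after the description, but their signatures, the "current" meta key and the stand-in classes are assumptions.

```python
import threading


class ItemRevision:
    """Stand-in for a revision: meta is a dict of strings, data is bytes."""

    def __init__(self, revno):
        self.revno = revno
        self.meta = {}
        self.data = b""
        self.synced = False

    def sync(self):
        # A real implementation would write the meta and data files here.
        self.synced = True


class ItemBundle:
    """Stand-in for a bundle with bundle-level meta and its revisions."""

    def __init__(self):
        self._lock = threading.Lock()
        self.meta = {"current": "0"}  # hypothetical key name
        self.revisions = {}

    def lock(self):
        # Locking hands back a copy of the current bundle meta.
        self._lock.acquire()
        return dict(self.meta)

    def unlock(self, new_meta=None):
        # Unlocking optionally stores the updated bundle meta.
        if new_meta is not None:
            self.meta = new_meta
        self._lock.release()


def save_revision(bundle, data, rev_meta):
    meta = bundle.lock()                    # 1. lock, get current meta
    try:
        revno = int(meta["current"]) + 1
        meta["current"] = str(revno)        # 2. update bundle meta
        rev = ItemRevision(revno)
        rev.meta = dict(rev_meta)           # 3. set meta and data of revision
        rev.data = data
        rev.sync()                          # 4. sync the revision object
        bundle.revisions[revno] = rev
    except Exception:
        bundle.unlock()                     # release without touching meta
        raise
    bundle.unlock(meta)                     # 5. unlock, setting new meta
    return rev
```

The bundle meta is only replaced in the final unlock, so a failure in between leaves the old meta in place.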
Item class, derive Page from Item. Move (un)quoteWikiname* to Item class and lazy eval / cache it.
- currently, the existing Page class has an ItemBundle of the LayeredStorage
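To illustrate the idea of a LayeredStorage with named layers, here is a minimal sketch. The "first layer that has the item wins" policy, the layer names "data" and "underlay" and the use of plain dicts as storages are assumptions for the example, not the actual API.

```python
class LayeredStorage:
    """Looks an item up in several named layers, highest priority first."""

    def __init__(self, layers):
        # layers: list of (layer_name, mapping) pairs, highest priority first
        self.layers = list(layers)

    def get(self, item_name):
        # Return (layer_name, item) from the first layer that has the item.
        for layer_name, storage in self.layers:
            if item_name in storage:
                return layer_name, storage[item_name]
        raise KeyError(item_name)


# Usage: two layers, but nothing stops us from passing more.
underlay = {"HelpContents": "help text", "FrontPage": "default front page"}
data = {"FrontPage": "customized front page"}
storage = LayeredStorage([("data", data), ("underlay", underlay)])
```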
LogfileRefactoring
- edit-log file and code were removed, using rootitem's revisions meta files as data store for RC currently (very similar to the per-item storage for action=info). A change to the wiki (like saving a page or creating a new item) sort of creates a new revision of the wiki - and the wiki is represented by the rootitem. So far, there is no corresponding revisions/xxxxxxxx data file in the rootitem.
  - long term, we could think of somewhat unifying action=info and RC. The main difference is scope and presentation, the storage layer is quite the same now. On the other hand, most of RC's code is handling presentation stuff, so we also can keep it separate.
- top level event-log is still there, not sure if it should stay there. Don't panic, I won't replace it in the same way as edit-log.
- log entries are dicts now, items addressable by name, easily extendable, easily readable files on disk.
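As an illustration of dict-based entries in easily readable files, here is a sketch that assumes log entries use the same "key: value" line format described for meta files. The function names and the example field names are made up.

```python
def parse_meta(text):
    """Parse 'key: value' lines into a dict of strings."""
    meta = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore empty lines
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError("not a 'key: value' line: %r" % line)
        meta[key.strip()] = value.strip()
    return meta


def format_meta(meta):
    """Serialize a dict of strings back into 'key: value' lines."""
    return "".join("%s: %s\n" % (key, meta[key]) for key in sorted(meta))


entry = parse_meta("action: SAVE\nitemname: FrontPage\ncomment: fixed: typo\n")
```

Adding a new field to an entry is then just adding a new key, and old readers can ignore keys they do not know.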
make ACLs, extended names (free links) and sub pages mandatory. Remove config variables.
rewrite attachment stuff (UnifyPagesAndAttachments)
- mig script makes subitems out of attachments, even guesses mimetypes. As all items have revisioning, we have revisioned "attachments" now.
  - as items, subitems, subsubitems are still stored "flat" by the wiki, we have a rename problem again (nothing new, it is the same as with subpages, you have to rename them separately). Can be solved by implementing a hierarchical storage. The storage stuff is already prepared for this, the markup and rest of the wiki is not.
    - IMHO we should rather solve it by implementing an internal rename, maybe enhanced by a link fixer. We do not need to limit our algorithm to a single rename. This should be protected by a distinct lock.
- for the mig script, existing attachments must inherit ACLs from parent page
- move action/AttachFile stuff into some handlers for those mimetypes
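What the mig script does for one attachment could be sketched like this. It is not the actual script: the function name, the "Parent/filename" subitem naming, the "acl" and "mimetype" meta keys and the application/octet-stream fallback are assumptions. Only mimetypes.guess_type() from the standard library is real.

```python
import mimetypes


def attachment_to_subitem(page_name, page_meta, filename):
    """Return (subitem_name, subitem_meta) for one attachment of a page."""
    # Guess the mimetype from the file name; fall back to a generic type.
    mimetype, _encoding = mimetypes.guess_type(filename)
    meta = {"mimetype": mimetype or "application/octet-stream"}
    # The subitem inherits the ACL of its parent page, if there is one.
    if "acl" in page_meta:
        meta["acl"] = page_meta["acl"]
    return "%s/%s" % (page_name, filename), meta


name, meta = attachment_to_subitem("FrontPage", {"acl": "All:read"}, "logo.png")
```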
use internal http auth, use the user name as id, not the numerical user id (abandon using the uid cookie)
rewrite user classes to store data on an item UserName/AccountData
- mig script converts user accounts, trails and bookmarks that way and generates meta data with acl and mimetype. Account data is stored in the data file, but in the same file format that the meta file uses.
- mig script also saves a file with id -> name matching (same format as meta files), so we could adapt the uid cookie code to use that in case the http auth stuff takes longer
- move userform / user stuff into some handlers for that mimetype.
- Until we do that, we could edit most of the user account stuff using a text/plain handler. Changing the SHA1 password would be quite uncomfortable with that.
use handlers for showing / editing / up/downloading / deleting / renaming / ... items of different mimetypes
1. base handler class for */* (behaving in a way that is useful for application/octet-stream)
  - show: show meta infos and download link
  - raw: directly send content with correct http headers
  - edit: upload form
  - delete item form
  - rename item form
  - diff (show if different or not, maybe both meta data sets, size)
2. derive handler for image/*
  - show: also show image inline, maybe some image infos, thumbnail etc.
3. we also need to handle that twikidraw stuff somehow. this is a special problem as there are 3 files:
  - .draw - drawing, special mimetype (put in revision content)
  - .map - for imagemaps (put where? add to meta maybe?)
  - .png - rendered drawing for showing in browsers (put in cache/ ?)
  - or just use a tgz of all three?
    - You want to unpack the .tgz on page request? Do you sell CPUs?
4. derive base handler for text/*, derive more specific handlers for text/(whatever we have parsers/editors for)
  - move Page / PageEditor stuff into some handlers
  - Page "show" handler needs to be specific for text/X-moin-wiki / rest / etc. Is that "Kill send_page(), quick!" finally?
  - But PageEditor stuff should be used as generic "edit" handler for text/* items, at least as fallback, if there is no more specific handler.
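The handler hierarchy outlined above could be sketched as follows. Class names, method names, the dict-based item and the lookup via fnmatch are assumptions for illustration. Only show and raw are included, and the show output is just a placeholder string.

```python
import fnmatch


class BaseHandler:
    """Fallback handler for */*, useful for application/octet-stream."""
    pattern = "*/*"

    def show(self, item):
        # Show some meta info and a download link.
        return "%s (%d bytes) [download]" % (item["name"], len(item["data"]))

    def raw(self, item):
        # Send the content as is; a real handler would also set http headers.
        return item["data"]


class ImageHandler(BaseHandler):
    pattern = "image/*"

    def show(self, item):
        # Also show the image inline, then the generic info.
        return "[inline image %s] " % item["name"] + super().show(item)


class TextHandler(BaseHandler):
    pattern = "text/*"

    def show(self, item):
        return item["data"].decode("utf-8")


# Most specific handlers first, the */* fallback last.
HANDLERS = [ImageHandler, TextHandler, BaseHandler]


def find_handler(mimetype):
    for cls in HANDLERS:
        if fnmatch.fnmatchcase(mimetype, cls.pattern):
            return cls()
    return BaseHandler()
```

More specific handlers (for example one per text/* subtype we have a parser for) would just be put in front of the generic ones.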

Ideas collection

Storage types

We can have different storage types - but to the wiki code it should not matter which type is used. All types will implement the same interface.

FileSystemStorage - the current solution (with PagesAsBundles)
- this is the first one to do - and only if it has proven to be ok, there can be more, like the following.
WikiRpcStorage - this is the second, enabling distributed wikis
BDBStorage - fast, compatible, easy, flexible, built-in.
MercurialStorage, SVNStorage, CVSStorage, RCSStorage, TlaStorage - whoever needs that will have to implement that, good revisioning built in
MySQLStorage, PostgreSQLStorage - whoever needs that will have to implement that (lots of overhead)
WebDavStorage - webdav also includes revisioning in the spec
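"All types will implement the same interface" could mean something like the sketch below. The method names and the in-memory backend are assumptions made up for the example, not the real storage API. The point is only that wiki code talks to the abstract interface and never to a concrete type.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """Hypothetical common interface for all storage types."""

    @abstractmethod
    def add_revision(self, name, data):
        """Store data as a new revision of item name, return its number."""

    @abstractmethod
    def get_revision(self, name, revno):
        """Return the data of revision revno (counting from 1) of item name."""

    @abstractmethod
    def list_items(self):
        """Return the sorted names of all items."""


class MemoryStorage(Storage):
    """Toy backend keeping everything in a dict, for illustration only."""

    def __init__(self):
        self._items = {}

    def add_revision(self, name, data):
        revisions = self._items.setdefault(name, [])
        revisions.append(data)
        return len(revisions)

    def get_revision(self, name, revno):
        revisions = self._items[name]
        if not 1 <= revno <= len(revisions):
            raise KeyError((name, revno))
        return revisions[revno - 1]

    def list_items(self):
        return sorted(self._items)


def copy_item(src, dst, name, revno):
    # Wiki code like this works with any two Storage implementations.
    return dst.add_revision(name, src.get_revision(name, revno))
```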

MoinMoin: StorageRefactoring/State (last edited 2007-10-29 19:08:43 by localhost)