Importing content from Sakai 2 into Sakai 3 (take 2)

Since returning from holiday, I have resumed work on importing content from Sakai 2 into Sakai 3. The first order of business was to refactor the XML parsing from SAX to StAX to deal with some potentially nasty classloader issues, as suggested by Dr. Ian Boston. That went smoothly, and I have to say that after using SAX, StAX is a much improved utility: you have more control and can pull events from the stream rather than having them pushed to you. This makes for more natural and supportable Java code.
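
To illustrate the pull style, here is a minimal sketch and not the actual import code. The element name "resource" and the attribute "id" are hypothetical stand-ins for whatever content.xml really contains; only the javax.xml.stream calls are standard JDK API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxPullSketch {

    // Collects the "id" attribute of every <resource> element.
    // Both names are assumptions made for illustration only.
    static List<String> resourceIds(String xml) {
        List<String> ids = new ArrayList<String>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // The caller drives the loop and asks for the next event (pull),
            // instead of registering callbacks as with SAX (push).
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "resource".equals(reader.getLocalName())) {
                    ids.add(reader.getAttributeValue(null, "id"));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse archive XML", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String xml = "<archive><resource id=\"/group/site/a.txt\"/>"
                + "<resource id=\"/group/site/b.txt\"/></archive>";
        System.out.println(resourceIds(xml));
    }
}
```

Because the loop is ordinary code, it is easy to stop early, skip a subtree, or hand the reader to a helper method, which is where the extra control comes from.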

Next, there were some improvements that I wanted to make to the import code:

  1. Support for org.sakaiproject.content.types.urlResource types.
  2. Adding the metadata to the imported content.

After a week of plugging away on both fronts, some good progress has been made. First, a model conversion had to be considered for org.sakaiproject.content.types.urlResource types. In Sakai 2, these URL resources are simply presented in the UI as hyperlinks that open in a new window. Given the RESTful nature of Kernel2 (K2), I needed to decide how best to represent a hyperlink. My first thought was to use the proxy capabilities of K2, but that presented some issues: proxy nodes must be stored under /var/proxy, and the whole notion of proxying HTTP requests has security implications, which is why K2 does not allow just anyone to create a proxy node.

I was probably too close to the problem and had trouble seeing the more obvious solution: why not use an HTTP redirect? After noodling on the problem for a while, the simpler solution finally entered my brain. After a bit of acking, I found that Sling already has support for redirects through its RedirectServlet, which binds to sling:resourceType=sling:redirect. So then, it was just a fairly simple matter of creating a node and setting the properties accordingly:

"jcr:created":"Wed Oct 07 2009 13:53:00 GMT-0400",
"jcr:lastModified":"Wed Oct 07 2009 13:53:00 GMT-0400",
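
As a rough sketch of the redirect-related properties (not the actual import code), the node needs the resource type mentioned above plus a target. The property name sling:target is my understanding of what Sling's RedirectServlet reads, so treat it as an assumption to verify against your Sling version; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RedirectNodeSketch {

    // Builds the properties for a node that should redirect to targetUrl.
    // "sling:target" is assumed to be the property RedirectServlet consults.
    static Map<String, String> redirectProperties(String targetUrl) {
        Map<String, String> props = new LinkedHashMap<String, String>();
        props.put("sling:resourceType", "sling:redirect");
        props.put("sling:target", targetUrl);
        return props;
    }

    public static void main(String[] args) {
        System.out.println(redirectProperties("http://sakaiproject.org/"));
    }
}
```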

That was pretty much it for org.sakaiproject.content.types.urlResource types; the redirect works as expected. There are still a couple of things I would like to improve in this area:

  1. The node names for these urlResources need to be beautified. As the resource name comes through in the content.xml, it looks like “http:__sakaiproject.org_”. I have to strip out the “:” to avoid a JCR exception, so the node name currently looks like “http__sakaiproject.org_”. Ideally, it would match the display name, i.e. “”. Perhaps some manner of escaping invalid characters might work, but further digging into the JCR is required. I am able to set “sakai:filename”:””, so maybe that is good enough; TBD.
  2. Since the jcr:primaryType==nt:unstructured, the URL is rendered as a folder when connected via WebDAV. It would be nice to get these URLs to render as a leaf node instead. I experimented with jcr:primaryType=nt:file, but ran into some roadblocks and backed off.
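
For the first item, one possible escaping approach is sketched below. This is only an idea, not something the import code does today. It percent-encodes the characters that JCR commonly rejects in node names; the exact set is an assumption, so check it against your repository. If I remember correctly, Jackrabbit's jcr-commons also ships a Text.escapeIllegalJcrChars helper that may be a better fit.

```java
class NodeNameEscapeSketch {

    // Characters assumed to be illegal in a JCR node name, plus '%' itself
    // so that the escaping can be reversed without ambiguity.
    private static final String ILLEGAL = "%/:[]*|\"'";

    static String escape(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (ILLEGAL.indexOf(c) >= 0) {
                out.append('%').append(String.format("%02X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '%' && i + 2 < escaped.length() + 0 && i + 2 <= escaped.length() - 1) {
                // Two hex digits follow every '%' produced by escape().
                out.append((char) Integer.parseInt(escaped.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = escape("http:__sakaiproject.org_");
        System.out.println(escaped);
        System.out.println(unescape(escaped));
    }
}
```

The appeal of a reversible scheme is that the original resource name can be recovered from the node name alone, without relying on a separate property.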

Regarding the mapping of metadata, that task proved to be mostly straightforward for the fields that have a one-to-one mapping. However, there are currently more supported metadata fields in Sakai2 than there are in Sakai3. There is no limitation on the number or type of metadata fields that can be stored in Sakai3, so I am considering storing all of the fields from Sakai2 as a precaution and for possible future-proofing. I am left wondering whether to store them with their current keys or to prepend something like “sakai2:” to all of the keys before storing them.
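
A minimal sketch of the prefixing option, assuming a lookup table of known one-to-one mappings. The keys used in main() are made up for illustration and are not real Sakai property names. One caveat I have not verified in K2: a prefix such as “sakai2:” looks like a JCR namespace prefix, so it would presumably need to be registered before properties can be stored under it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MetadataPrefixSketch {

    static final String LEGACY_PREFIX = "sakai2:";

    // Fields with a known one-to-one mapping get their Sakai 3 key;
    // everything else is kept under the legacy prefix so nothing is lost.
    static Map<String, String> mapProperties(Map<String, String> sakai2Props,
                                             Map<String, String> knownMappings) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> entry : sakai2Props.entrySet()) {
            String mappedKey = knownMappings.get(entry.getKey());
            if (mappedKey != null) {
                result.put(mappedKey, entry.getValue());
            } else {
                result.put(LEGACY_PREFIX + entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> known = new LinkedHashMap<String, String>();
        known.put("displayname", "sakai:filename"); // hypothetical mapping

        Map<String, String> source = new LinkedHashMap<String, String>();
        source.put("displayname", "Syllabus.pdf");  // hypothetical Sakai 2 key
        source.put("copyright", "Public domain");   // hypothetical unmapped key

        System.out.println(mapProperties(source, known));
    }
}
```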

Looking towards the near term, I am likely to look into the following issues:

  1. All user uploaded content is currently stored in a BigStore under /_user/files. After discussing with Ian Boston, I will most likely refactor the import code to store its content in that BigStore as well. The BigStore concept will likely be redesigned in the near future, though, so any work I do in this area will be abstracted so that this behavior can be changed easily when and if BigStore is redesigned.
  2. With the move to BigStore, I will have to take a look at access control lists (ACLs) so that the user importing content will have the proper permissions.
  3. Next, I need to take a look at the contract between K2 and the “Content & Media” widget so that the imported content appears properly within the user interface.
  4. What about other content types that could be imported today? Content from the Forums tool may be a good candidate as K2 currently has support for threaded discussions. Chat might be another place to look… Other ideas?

Regarding the Sakai2+3 Hybrid mode, I have hopes to arrange a two day coding sprint with Dr. Chuck Severance and Noah Botimer to develop a BasicLTI consumer for Sakai3. This would allow us to easily place a Sakai2 tool within the context of a Sakai3 site. With any luck, we will get this sprint organized by the end of January. Until next time, L


Filed under Java, Sakai, Technology

Importing content from Sakai 2 into Sakai 3 (take 1)

Development was starting to slow down for me on the Sakai2+3 Hybrid Mode, so I needed to turn my primary focus elsewhere. Michael Korcuska and I had decided previously that the next focus point would be to develop a working prototype that would allow someone to take a zip file exported from Sakai 2’s Site Archive tool and import the content into Sakai 3. Initially the scope would be limited to just the content contained within the Resources tool (a.k.a. ContentHostingService) since Sakai 3 currently has enough functionality to support the files and folders model.

When I started down this path, I did not expect to reach a stopping point by the end of the week. Frankly I thought it would take longer. But after a couple of days, I had the logic around parsing the content.xml file and extracting the content into my local file system working pretty well. The next couple of days were spent porting this working code into Kernel2 as a SlingServlet and creating a RESTful web service. After a couple of bumps in the road and someone moving my cheese, I am pleased to say that the first iteration of this work is complete.

As an example, you can take the sample file which came from a Sakai 2 test instance, and upload it to Sakai 3:

curl -F"path=/site/import/folder" -F"" http://username:password@localhost:8080/foo.sitearchive.json

The web service expects two parameters:

  1. path: The path to a folder where you want the content imported.
  2. Filedata: one or more zip files to import.
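
For anyone calling the service from code rather than curl, the request body can be sketched as below. This is an illustration under stated assumptions, not part of the import code: the field names path and Filedata come from the list above, while the boundary string and the zip file name are arbitrary placeholders. The sketch only builds the multipart/form-data body; sending it is left to whatever HTTP client you prefer.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ImportRequestSketch {

    private static final String CRLF = "\r\n";

    // Builds a multipart/form-data body with one "path" field and one
    // "Filedata" file part. The request would also need the header
    // Content-Type: multipart/form-data; boundary=<boundary>
    static byte[] multipartBody(String boundary, String path,
                                String zipName, byte[] zipBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out, "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"path\"" + CRLF + CRLF
                + path + CRLF);
        write(out, "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"Filedata\"; filename=\""
                + zipName + "\"" + CRLF
                + "Content-Type: application/zip" + CRLF + CRLF);
        out.write(zipBytes, 0, zipBytes.length);
        write(out, CRLF + "--" + boundary + "--" + CRLF);
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] fakeZip = new byte[] {0x50, 0x4B, 0x03, 0x04}; // placeholder bytes
        byte[] body = multipartBody("sketch-boundary", "/site/import/folder",
                "archive.zip", fakeZip);
        System.out.println(body.length + " bytes");
    }
}
```

Importing several archives in one request would mean repeating the Filedata part once per zip file.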

The result will be a folder that looks like the following screen shot:

While there is still much to be done (e.g. mapping file meta-data, support for more resource types, etc.), this is an important first step. For one, it demonstrates technical feasibility. Secondly, it creates the beginnings of a framework that can be extended to support importing other Sakai 2 tools, and eventually other import formats entirely like IMS Common Cartridge. If you are interested in looking at the code, it can be found at my github repository.

Looking forward, I will likely begin investigating IMS Basic LTI as a mechanism to enhance the Sakai 2+3 Hybrid capabilities. Currently, the hybrid mode supports entire sites (i.e. the user chooses to enter either a Sakai 3 site or a Sakai 2 site via the Sakai 3 portal). Ideally, one should be able to mix and match tools from either Sakai 2 or 3 in a Sakai 3 site. Dr. Chuck has done some good work in this area: Sakai 2.7.0 will have both a BasicLTI consumer and producer. So theoretically, if Sakai 3 had a BasicLTI consumer, it could present a Sakai 2 tool to a user as a Sakai 3 widget. My hope is that Dr. Chuck, Noah Botimer, and I could turn out a Sakai 3 LTI consumer relatively quickly. More to come in the new year. Best regards, L


Filed under Java, Sakai

maven2 bash completion complete

I have been utterly spoiled by bash completion when using svn and git for the past few months – the only thing that was missing was maven completion.  Since I could not sleep this morning, I set out to fix that.  First, a little bit of background.  I have been using MacPorts to install both subversion and git.  Both had a variant “+bash_completion” – I did not know what it did at the time, but it sounded cool so I included that variant when I installed them.

git-core @
subversion @1.6.5_0+bash_completion+no_bdb
For example: sudo port install git-core +bash_completion +doc

After digging a bit further, I figured out that I need to add the following lines to ~/.profile to get bash_completion to take off:

if [ -f /opt/local/etc/bash_completion ]; then
    . /opt/local/etc/bash_completion
fi

On the surface you might think that completion might only be aware of common command line arguments to svn and git binaries, but they are actually a little smarter. For example, in my git repository typing “git checkout <TAB>” will list all of the branches in the repository! Very handy!

So now, how to get maven2 commands into bash completion? I started with the first Google hit: Guide to Maven 2.x auto completion using BASH. That worked, but it was missing a lot of the commands I wanted easy access to, and it was not obvious to me how to extend their script. Next, Google led me to another hit: Maven Tab Auto Completion in Bash. This script had more completions out of the box and it was obvious how to add more. With some quick hacking, my /opt/local/etc/bash_completion.d/m2 now looks like:

# Bash Maven2 completion
_mvn()
{
    local cmds cur colonprefixes
    cmds="clean validate compile test package integration-test \
        verify install deploy test-compile site generate-sources \
        process-sources generate-resources process-resources \
        eclipse:eclipse eclipse:add-maven-repo eclipse:clean \
        idea:idea -DartifactId= -DgroupId= -Dmaven.test.skip=true \
        -Declipse.workspace= -DarchetypeArtifactId= \
        netbeans-freeform:generate-netbeans-project \
        tomcat:run tomcat:run-war tomcat:deploy \
        sakai:deploy -Predeploy \
        dependency:analyze dependency:resolve \
        versions:display-dependency-updates versions:display-plugin-updates \
        javadoc:aggregate javadoc:aggregate-jar"
    COMPREPLY=()
    # The word currently being completed
    cur=${COMP_WORDS[COMP_CWORD]}
    # Work-around bash_completion issue where bash interprets a colon
    # as a separator.
    # Work-around borrowed from the darcs work-around for the same
    # issue.
    colonprefixes=${cur%"${cur##*:}"}
    COMPREPLY=( $(compgen -W '$cmds' -- "$cur") )
    local i=${#COMPREPLY[*]}
    while [ $((--i)) -ge 0 ]; do
        # Strip the part up to the last colon, which bash already holds
        COMPREPLY[$i]=${COMPREPLY[$i]#"$colonprefixes"}
    done
    return 0
} &&
complete -F _mvn mvn

You will notice that I have added the common Sakai goals like sakai:deploy or -Predeploy. I have also added some other maven plugins that I find useful. Give it a try: “mvn <TAB><TAB>” or maybe “mvn sak<TAB>” or how about “mvn ecl<TAB>”. I hope you will find bash completion just as satisfying as I do.  Best, L


Filed under Java, Technology

Sneak Preview: Sakai 2+3 Hybrid pre-alpha

This video gives you the first glimpse of the hybrid integration between Sakai 2 and Sakai 3. This is a very early look and does not show a finished product, but rather enough of the user interface to begin real discussion and refinement.

Many thanks to: Ian Boston, Paul Bristow, Oszkar Nagy, Christian Vuerings.


Filed under Sakai

Investigating site exports from Sakai 2

So the export/import file format investigation has reached some early conclusions regarding the use of Moodle’s backup schema and it looks like we will be looking elsewhere. See: Moodle export-import format investigation and the email thread itself.

While we ponder IMS Common Cartridge, I thought I would investigate what it would take to provide the capability of exporting Sakai 2 sites into the existing Sakai 2 proprietary XML format. This is a long-standing request within the Sakai community, but one that no one has been willing to tackle. This is a bit of a dodgy situation as most tools do participate in the method EntityTransferrer.transferCopyEntities(), so it is possible to copy the structure of a site from semester to semester. I use the term “structure” because it is common practice among LMS applications to copy only what might be termed a “template” across semesters. For example, this copy process would include content like forum definitions, but not student responses; grade book items, but not student grades, etc. The primary use case is an instructor who taught a class last semester and wants to import that previous site into the current semester’s course site to reduce setup time.

So far so good, but here is where things get a bit dodgy: the EntityTransferrer.transferCopyEntities() method copies entities directly from one site to another (i.e. without writing any of these entities to XML). While Sakai 2 does have a mechanism for writing entities to XML, called ArchiveService.archive(), there are at least two problems with it: 1) unlike transferCopyEntities(), all student postings, grades, etc. are included in the XML produced (i.e. more like a site backup), and 2) only a small subset of tools actually implement the ArchiveService.archive() interface! So this leaves me wondering:

  1. Does anyone actually depend on ArchiveService.archive()? My instincts tell me no since most of the tools do not implement it. Am I wrong?
  2. Could we usurp the ArchiveService.archive() interface and change the behavior so that only site structure is exported without student content?
  3. Do we leave ArchiveService.archive() alone and create a new API?
  4. How many tools still need to implement archive()?

Filed under Java, Sakai

Sakai 3 export/import formats research

Since Sakai 3 is a ground-up rewrite, there are plenty of opportunities to rethink assumptions that have accumulated over the years. One of those areas where a fresh look should do some good is in course export/import formats. Sakai 2 has its own proprietary format which provides a “full fidelity” capability to move from one course site to another without losing anything. While not losing anything is desirable, it comes at quite a cost; i.e. not being compatible with anything else on the planet. This approach is not uncommon and I see now that Moodle also has its own proprietary format.

While there are some good open standards for course export/import (e.g. IMS Common Cartridge or SCORM), they will not provide a “full fidelity” export/import workflow where one could export from Sakai 2 and then import into Sakai 3; i.e. some information would get lost in the translation. Supporting these open standards would have other important benefits, but they are not first-order solutions for simply getting from Sakai 2->3. My hope is that one day an open standard could be expressive enough to cover such a use case, but the innovation curve in the applications and tools may always exceed the ability of a lowest common denominator solution.

With that said, we thought it would be beneficial to see if Moodle’s export/import format provided enough capability to move data from Sakai 2->3 without losing any critical structure. If this worked out, we would have one great feature out-of-the-box: the ability to move a course from Moodle to Sakai and vice-versa. It will be interesting to see how this works out and I will keep you updated as progress is made…

Filed under Education, Sakai, Technology

Tags for OS X Users – Finding Stuff

I have been experimenting with some tools over the past six months to help me get better organized and increase the chances I can actually find something on my computer. I started my exploration with a program called Together from Reinvented Software.  Together was a good place for me to begin exploring the set of tags I would use and to incorporate tagging into my daily work-flows.  The things I liked about Together:

  1. It did not lock my content up into some bizarro binary file that could never be parsed again.  Instead, it neatly organized all of the files you added to Together in your Documents folder.  Seemed like a pretty reasonable thing.
  2. The tagging UI was pretty fast and had auto-complete.

These aspects of Together kept me using it for a few months until I discovered:

  1. The tags themselves were locked up in some bizarro binary file format! That did not sit well with me thinking about years and years of collected tags. While the program does have some way to sync those tags to Spotlight comments, the author warns that it will slow the application down and consume huge amounts of resources.
  2. The application became slow and unresponsive – even though I did not have the Spotlight-tag-syncing, resource-sucking option turned on.
  3. The work-flow of drag-and-drop became too cumbersome and I wanted something more streamlined.

I continued trudging through my use of Together, when one day the MacUpdate Promo had an application called Tags by Gravity Applications advertised at a steep discount. It sounded like a good fit, maybe too good to be true, but the price was right and it was worth trying to see if it lived up to the hype. What I liked:

  1. Your tags are not locked away – they are part of the file’s metadata. This should allow my tags to travel as long as Spotlight is around.
  2. Spotlight can search for the tags – try a search like: tag:receipt apple.
  3. The UI is lightweight and fast. It may feel a bit unprofessional (just a personal opinion), but it cuts the mustard.
  4. It is integrated with almost every application I use, and thus knows what file I am trying to tag without some cumbersome drag-and-drop work-flow. Very smart and efficient.

I am glad to say that I am still using Tags on a daily basis and have built some Automator work-flows around it as well. And… They recently provided me with an update to get everything working smoothly with Snow Leopard. Overall, I am very satisfied with this solution: I am able to find things reliably. And after using this work-flow for a while, it does strike me that this should be a base capability in OS X. I do find some evidence that Apple is headed down this path if you try the following in Snow Leopard: Print -> Save as PDF -> then play around in the Keywords field. I am seeing auto-complete. Are you?

Filed under Personal, Technology, Tools