The environment of my Strigi-chemical project is very dynamic: Strigi is under heavy development and undergoing XESAM-reforms at the moment, there is also much happening in open source cheminformatics world, like new OpenBabel release and things like OSRA optical structure recognition or structural information embedded in PNG images. This post outlines the final TODO of my project which you can expect to be ready by the end of this SoC.
- JStreams SDF substream provider is a nice feature which represents an SDF file as a virtual folder of MOL files. I'm trying to make it work stable at the moment. This makes SDF analysis transparent as if it really was a set of MOL files. It also allows to browse the contents of SDF with jstream:// KIO;
- I need to fix CML2 analyzer and make sure it correctly recognizes the newly generated CMLs from Jerome's chemical-structures-2 repository;
- Use MIME helper to decide whether to continue with stream analysis or to skip the stream. It will probably allow to optimize some greedy analyzers;
- Make sure that OpenBabel helper works well and does not crash with parallel analyzers;
- Create a chemical PNG analyzer to extract chemical information (MOL or InChI) from images; Unittests are based on samples generated by Firefly and gchempaint software;
- Make sure all strigi-chemical analyzers conform Strigi PLUGIN architecture and have unittests with files taken from Blue Obelisk CTFR (Chemical Test File Repository)
- Update chemical ontology to prepare it for migration to XESAM ontology (fix types, cardinalities, child-parent links and indexing flags);
- And finally, build a sample GUI application, using molsKetch and avogadro as Kparts. This application should be able to input a structure by drawing it, represent it as OpenBabel molecule, convert it to InChI using OB, make a XESAM query over dbus to strigidaemon, display the list of results with a structural preview powered by Avogadro (Kalzium).
It is quite a lot of work for 5 days left, but I have drafts of everything listed above so it should be feasible.
0 Komentar