Posts Tagged ‘java’

Pentaho Business Intelligence Plug-in

Tuesday, June 1st, 2010

I highlight this project because it demonstrates my ability to quickly grok a large, unfamiliar codebase, even one with little documentation, and to make meaningful modifications and contributions to that code. In this case I wrote a Java-based plug-in for an open source Business Intelligence suite by Pentaho Corporation. Grokking the internals of this powerful system was non-trivial, but I was aided by my experience as the designer of D2K, another data-flow RAD environment for data integration and data mining.

IntelliBadge | Tracking a Major Conference

Monday, May 31st, 2010

“IntelliBadge™, an NCSA experimental technology, is an academic experiment that uses smart technology to track participants at major public events. IntelliBadge™ was first publicly showcased at SC-2002, the world’s premier supercomputing conference, in the Baltimore Convention Center. This was the first time that radio frequency tracking technology, database management/mining, real-time information visualizations and interactive web/kiosk application technologies fused into operational integrated system and production at a major public conference.”

I was involved in this project at many levels, including setup and administration of Linux machines, integration of my collaborative video streaming software, multi-threading the application, mirroring code for this P2P architecture, and developing post-conference data analytics with custom Java-based software.

D2K – Datamining Infrastructure

Monday, May 31st, 2010

I was the original architect and author of this 100% Java data-mining system. Once known as D2K (Data to Knowledge), this system was most fundamentally a model for designing custom data-mining solutions. It was also a rapid application development environment for building those solutions, with a powerful run-time environment. I wrote the original prototype for D2K while working for the Automated Learning Group (ALG) at NCSA. Tom Redman (from the Mosaic project) would soon join the team to create the interface and RAD component of the system. David Tcheng’s ideas were the intellectual foundations of many of the algorithms implemented within the system. I did two more major rewrites of the infrastructure during my time in the ALG, during which this small research group grew from 3 to well more than a dozen people, increasingly focused on some aspect of D2K. D2K quickly turned into a flagship effort of NCSA, and certainly of ALG, and subsequently became the central tool for a startup company specializing in real-time analytics: River Glass. There is now a project underway to develop a next-generation evolution of this software, a semantic-driven system called Meandre, of which I am only an interested observer.

Caterpillar – Reliability Information System

Sunday, May 30th, 2010

My collaborations with the research and engineering staff at the Engine Division of Caterpillar Corporation began at NCSA and later extended to more targeted goals under an independent consulting relationship. Notably, I designed and implemented the central algorithm and Java class structure for Caterpillar’s next-generation Reliability Information System. This work, based primarily on the Weibull distribution, was inserted into their existing quality assurance processes. Until then, this process still involved significant manual effort, with engineers working with calculator, pen and paper.
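
The Caterpillar code itself is not reproduced here. As a rough illustration of the kind of calculation a Weibull-based reliability tool automates, the sketch below assumes the standard two-parameter Weibull model, with reliability R(t) = exp(-(t/η)^β). The class name, method names and parameter values are hypothetical and chosen only for this example.

```java
/**
 * Illustrative two-parameter Weibull reliability model.
 * Hypothetical sketch only; not the Caterpillar implementation.
 */
public final class WeibullReliability {
    private final double shape; // beta: <1 early failures, 1 random, >1 wear-out
    private final double scale; // eta: characteristic life, same unit as t

    public WeibullReliability(double shape, double scale) {
        if (!(shape > 0.0) || !(scale > 0.0)) {
            throw new IllegalArgumentException("shape and scale must be positive");
        }
        this.shape = shape;
        this.scale = scale;
    }

    /** Probability that a unit survives beyond time t: R(t) = exp(-(t/eta)^beta). */
    public double reliability(double t) {
        if (t <= 0.0) {
            return 1.0;
        }
        return Math.exp(-Math.pow(t / scale, shape));
    }

    /** Probability that a unit has failed by time t: F(t) = 1 - R(t). */
    public double failureProbability(double t) {
        return 1.0 - reliability(t);
    }

    /** Instantaneous failure rate: h(t) = (beta/eta) * (t/eta)^(beta-1), for t > 0. */
    public double hazardRate(double t) {
        if (!(t > 0.0)) {
            throw new IllegalArgumentException("t must be positive");
        }
        return (shape / scale) * Math.pow(t / scale, shape - 1.0);
    }

    /** Time by which a fraction p of units is expected to fail (p = 0.10 gives the B10 life). */
    public double lifeAtFailureFraction(double p) {
        if (!(p > 0.0 && p < 1.0)) {
            throw new IllegalArgumentException("p must be strictly between 0 and 1");
        }
        return scale * Math.pow(-Math.log(1.0 - p), 1.0 / shape);
    }

    public static void main(String[] args) {
        // Made-up parameters: wear-out behaviour with a 1000-hour characteristic life.
        WeibullReliability model = new WeibullReliability(2.0, 1000.0);
        System.out.println("R(500 h)  = " + model.reliability(500.0));
        System.out.println("h(500 h)  = " + model.hazardRate(500.0));
        System.out.println("B10 life  = " + model.lifeAtFailureFraction(0.10));
    }
}
```

In practice the shape and scale parameters would be estimated from field failure data rather than supplied by hand; that fitting step is outside the scope of this sketch.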