Observation based data model

Observation based data model

Jordan Ilott-2

I'm developing an application for data exploration, and I would like the data model I use in Chaco to reflect that my data is indexed by observation. An example of this type of data is the classic iris data set (http://en.m.wikipedia.org/wiki/Iris_flower_data_set). I would like to generate interactive scatter plot matrices and also observation-indexed line charts of individual variables. In my mind, this requires that all data be contained in a single data source. I've implemented some basic tests using pandas in a plotdatasource, but as this is basically just a helper class, it doesn't seem suitable for the purpose.

Has anyone done anything like this? Does anybody have suggestions for implementing it? I was considering a new datasource that uses pandas, plus modifications or extensions to the scatter and line plots to support named access to the data source. I look forward to any input or suggestions.

Jordan
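
As a rough illustration of the observation-indexed model described in this post, here is a minimal plain-Python sketch. The `ObservationTable` class and its method names are hypothetical, invented for this example; it is neither a Chaco nor a pandas API, only a picture of "all variables in one source, addressed by name, sharing one observation index".

```python
from dataclasses import dataclass, field


@dataclass
class ObservationTable:
    """Hypothetical container: named columns sharing one observation index."""

    columns: dict = field(default_factory=dict)

    def __post_init__(self):
        # Every variable must supply exactly one value per observation.
        lengths = {len(values) for values in self.columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have one value per observation")

    def __len__(self):
        # Number of observations (rows); zero if there are no columns.
        return len(next(iter(self.columns.values()), []))

    def column(self, name):
        # Named access: the values of one variable, ordered by observation.
        return self.columns[name]

    def observation(self, i):
        # One row: every variable's value for observation i.
        return {name: values[i] for name, values in self.columns.items()}


# First three rows of the iris data set (two measurements plus species).
iris = ObservationTable({
    "sepal_length": [5.1, 4.9, 4.7],
    "sepal_width": [3.5, 3.0, 3.2],
    "species": ["setosa", "setosa", "setosa"],
})
```

A scatter plot of two variables would pull `iris.column("sepal_length")` and `iris.column("sepal_width")`, while an observation-indexed line chart would plot a single column against `range(len(iris))`.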


_______________________________________________
Enthought-Dev mailing list
[hidden email]
https://mail.enthought.com/mailman/listinfo/enthought-dev
Re: Observation based data model

Eraldo Pomponi
Dear Jordan,

Some time ago Corran gave us a wonderful example of what Traits/TraitsUI/Chaco (no pandas) can do in this post (using the Iris dataset):


It could be a good starting point for you. 

HTH

Cheers,
Eraldo 


Re: Observation based data model

Peter Wang-2
In reply to this post by Jordan Ilott-2
On Wed, Feb 20, 2013 at 9:14 AM, Jordan Ilott <[hidden email]> wrote:

> Has anyone done anything like this? Does anybody have suggestions for implementing it? I was considering a new datasource that uses pandas, plus modifications or extensions to the scatter and line plots to support named access to the data source. I look forward to any input or suggestions.


A while ago, for the new plotting system I'm working on, I created a cheap-and-cheerful little prototype which cloned some of the ggplot syntax for doing faceted plots, backed by Chaco:


Which produces this image: http://i.imgur.com/m0XCTco.png

Since Bokeh is mainly focusing on HTML-based output for the time being, and we've switched to a different declarative syntax for plot specification than the ggplot style seen in the above demo, the chaco_gg stuff is not really being worked on right now. However, the code that's there does still work, and the code in https://github.com/ContinuumIO/Bokeh/blob/master/bokeh/chaco_gg/ggplot.py#L226 might be of some use for you.

-Peter
 

Re: Observation based data model

Jordan Ilott-2
Peter,

Thanks for your suggestions. I had discovered your work previously; in fact, I've been using the PandasDatasource class as a data model for Chaco.

So far, I've created a scatter plot matrix class that creates datasources and ranges for the columns in the pandas dataframe. My code then creates a scatter plot renderer for each combination of the dataframe columns and adds it to a grid container. By reusing the datasources and the ranges, I've been able to get some level of brushing and coordinated zooming and panning working without doing anything too fancy (just hooking up the tools in the usual way). I will have to use some trait change event handling to get the selection (brushing) to work with every plot in the matrix, though.
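
The reuse of datasources and ranges described above can be sketched in plain Python. This is only an illustration under assumptions: `ColumnSource` and `build_matrix` are hypothetical stand-ins rather than Chaco classes, and each "range" is just a mutable `[low, high]` list. The point is that every cell of the matrix holds references to the same per-column objects, so a change made through one cell is visible in all the others.

```python
from itertools import permutations


class ColumnSource:
    """Stand-in for a per-column data source (not a Chaco class)."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)
        # Shared, mutable metadata: every plot using this source sees it.
        self.metadata = {"selections": []}


def build_matrix(columns):
    """Map each ordered column pair to its shared sources and ranges."""
    # One source and one range per column, created exactly once.
    sources = {n: ColumnSource(n, v) for n, v in columns.items()}
    ranges = {n: [min(v), max(v)] for n, v in columns.items()}
    return {
        (x, y): {
            "x": sources[x], "y": sources[y],
            "x_range": ranges[x], "y_range": ranges[y],
        }
        for x, y in permutations(columns, 2)
    }


cells = build_matrix({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})
# Brushing in the (a, b) cell writes to the shared source for column "a"...
cells[("a", "b")]["x"].metadata["selections"] = [0, 2]
# ...so the (a, c) cell, which plots the same source, sees the selection too.
linked = cells[("a", "c")]["x"].metadata["selections"]
# Zooming the x axis of (a, b) in place also moves the x axis of (a, c).
cells[("a", "b")]["x_range"][:] = [1.5, 2.5]
```

In a real Chaco application the sources, ranges, and renderers would be Chaco objects placed in a grid container; the sharing pattern is the part this sketch is meant to show.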

Once the basic interactive features are implemented, I want to start looking at applying multiple filters (based on selections) to include or exclude results. I also need to be able to select and group observations into more than one group. I think this means applying multiple selection tools and somehow switching which one is active without changing what may have already been brushed. I think it's possible to add, remove, and enable/disable tools dynamically, but I'm not sure that it's easy to create selection groups. From what I've learned so far, I think that the selection tools always use the same names (selection and hover) in the datasource metadata, so I'm not sure how best to implement these features. I'm planning to share this code once it's done; however, I am learning Python and Chaco at the same time, so it might be a little rough around the edges.

Jordan
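
One possible reading of the include/exclude filtering described in this post, sketched in plain Python. The `apply_filters` function and the group names are hypothetical, and the semantics are an assumption: an observation is kept only if it belongs to every "include" group and to no "exclude" group. The post does not say how several filters should combine.

```python
def apply_filters(n_observations, include=(), exclude=()):
    """Return the sorted observation indices that pass all filters.

    include: iterable of index collections; a row must be in every one.
    exclude: iterable of index collections; a row must be in none.
    """
    kept = set(range(n_observations))
    for group in include:
        kept &= set(group)   # keep only rows present in this group
    for group in exclude:
        kept -= set(group)   # drop rows present in this group
    return sorted(kept)


# Three brushed groups over six observations (illustrative indices).
groups = {"group_a": [0, 1, 2, 5], "group_b": [2, 3, 5], "outliers": [5]}
visible = apply_filters(
    6,
    include=[groups["group_a"], groups["group_b"]],
    exclude=[groups["outliers"]],
)
# visible == [2]: the only row in both groups that is not an outlier.
```

Because each group is stored separately, enabling or disabling a filter only changes which groups are passed in; the brushed indices themselves are left untouched.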

Re: Observation based data model

Peter Wang-2
On Wed, Feb 20, 2013 at 7:59 PM, Jordan Ilott <[hidden email]> wrote:
> Peter,
> Thanks for your suggestions. I had discovered your work previously; in fact, I've been using the PandasDatasource class as a data model for Chaco.

Great.
 
> Once the basic interactive features are implemented, I want to start looking at applying multiple filters (based on selections) to include or exclude results. I also need to be able to select and group observations into more than one group. I think this means applying multiple selection tools and somehow switching which one is active without changing what may have already been brushed. I think it's possible to add, remove, and enable/disable tools dynamically, but I'm not sure that it's easy to create selection groups. From what I've learned so far, I think that the selection tools always use the same names (selection and hover) in the datasource metadata, so I'm not sure how best to implement these features.

OK, I've just submitted a pull request to make this easier for you.  (https://github.com/enthought/chaco/pull/96)  Once the PR gets accepted, you should be able to set the 'metadata_name' trait of the different selection tools.  By syncing these up between the plots which should have linked tools, you can have different, unrelated sets of selection tools, looking at differently named metadata keys on the datasource.

HTH,
Peter
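
To illustrate the idea behind the pull request, here is a plain-Python sketch of tools that write to differently named metadata keys on one shared datasource. The `Source` and `SelectionTool` classes are hypothetical stand-ins, not Chaco's classes; only the `metadata_name` attribute name is taken from the message above, and the key names `group_a` and `group_b` are made up for the example.

```python
class Source:
    """Stand-in for a data source with a metadata dict (not a Chaco class)."""

    def __init__(self, values):
        self.values = list(values)
        self.metadata = {}


class SelectionTool:
    """Stand-in tool; only the `metadata_name` idea comes from the thread."""

    def __init__(self, source, metadata_name="selections"):
        self.source = source
        self.metadata_name = metadata_name

    def select(self, indices):
        # Record the brushed indices under this tool's own metadata key.
        self.source.metadata[self.metadata_name] = sorted(indices)


petal_length = Source([1.4, 1.4, 1.3, 4.7, 4.5])  # illustrative values

# Two tools on different plots share a key, so they are linked...
tool_a_plot1 = SelectionTool(petal_length, metadata_name="group_a")
tool_a_plot2 = SelectionTool(petal_length, metadata_name="group_a")
# ...while a third tool uses another key and stays independent.
tool_b = SelectionTool(petal_length, metadata_name="group_b")

tool_a_plot1.select([0, 1])
tool_b.select([3, 4])
```

After these calls the source holds both groups side by side, one under each key, so brushing a second group does not overwrite the first.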
 

Re: Observation based data model

Robert Kern
On Thu, Feb 21, 2013 at 2:44 AM, Peter Wang <[hidden email]> wrote:

> On Wed, Feb 20, 2013 at 7:59 PM, Jordan Ilott <[hidden email]> wrote:
>>
>> Peter,
>> Thanks for your suggestions. I had discovered your work previously; in
>> fact, I've been using the PandasDatasource class as a data model for Chaco.
>
> Great.
>
>> Once the basic interactive features are implemented, I want to start
>> looking at applying multiple filters (based on selections) to include or
>> exclude results. I also need to be able to select and group observations
>> into more than one group. I think this means applying multiple selection
>> tools and somehow switching which one is active without changing what
>> may have already been brushed. I think it's possible to add, remove, and
>> enable/disable tools dynamically, but I'm not sure that it's easy to create
>> selection groups. From what I've learned so far, I think that the selection
>> tools always use the same names (selection and hover) in the datasource
>> metadata, so I'm not sure how best to implement these features.
>
> OK, I've just submitted a pull request to make this easier for you.
> (https://github.com/enthought/chaco/pull/96)  Once the PR gets accepted, you
> should be able to set the 'metadata_name' trait of the different selection
> tools.  By syncing these up between the plots which should have linked
> tools, you can have different, unrelated sets of selection tools, looking at
> differently named metadata keys on the datasource.

Merged. Thanks!

--
Robert Kern
Enthought