Data Visualization through Crowdsourcing: Canvas Software

Battelle Number: 30644 | N/A

Technology Overview

Across business and government, the volume of data is growing at an exponential rate. Analysts rely on computational tools and techniques to navigate and classify this vast, complex information quickly. Visualization is a key technique, but the sheer scale of many data streams makes efficient analysis difficult, and some visualization tools struggle with heterogeneous data, such as pictures and images. In addition, most analyses start from scratch, forcing each user to work in isolation. Pacific Northwest National Laboratory’s Canvas software uses crowdsourcing to build on the work of others, creating a learning computer system that analyzes data more efficiently and effectively.

Canvas is built on the concepts of context and machine learning. In the software, users access data visualized as individual colored dots on a two-dimensional graph. As users arrange the data on the graph, they implicitly communicate their latent mental model and provide the context the software needs to “learn.” Canvas uses information-theoretic measures to determine which features (for example, height, recency, or financial risk) each user employed to arrange the data. Canvas then subtly perturbs the positions on the screen to suggest arrangements that increase the mutual information between the key features and the positions. For example, if one user arranges data about disease instances by location and another arranges the data by virality, Canvas may arrange the data by frequency, helping to highlight the possible start of an epidemic. The software can also suggest positions for new data that the user has not yet considered.
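The feature-inference step described above can be illustrated with a minimal sketch. It is not the Canvas implementation: the equal-width binning, the plug-in mutual information estimate, and the names `discretize`, `mutual_information`, and `rank_features` are all assumptions made for illustration, and the sample data are invented. The sketch scores each candidate feature by how much information it shares with the screen positions a user chose.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Equal-width binning of numbers into integer bin labels (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two equal-length label sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_features(features, positions, bins=4):
    """Rank features by mutual information with the binned 2-D screen cell."""
    x_bins = discretize([p[0] for p in positions], bins)
    y_bins = discretize([p[1] for p in positions], bins)
    cells = list(zip(x_bins, y_bins))  # each dot's grid cell on the canvas
    scores = {
        name: mutual_information(discretize(vals, bins), cells)
        for name, vals in features.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: eight items, two candidate features.
features = {
    "height": [1, 2, 3, 4, 5, 6, 7, 8],
    "recency": [5, 1, 8, 3, 7, 2, 6, 4],
}
# The user lays the dots out left to right by height, on a single row.
positions = [(10.0 * h, 0.0) for h in features["height"]]

ranked = rank_features(features, positions)
for name, score in ranked:
    print(f"{name}: {score:.2f} bits")
# height: 2.00 bits
# recency: 1.00 bits
```

In this toy case the layout shares more information with height than with recency, so height is the feature the user most plausibly arranged by. The plug-in estimate is biased upward on small samples, which is why recency still scores above zero here; a practical system would likely need a bias correction or more data points.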

Because Canvas remembers these arrangements, the data become more valuable to other users. By effectively “crowdsourcing” new, salient features, Canvas helps users establish new models more rapidly and address additional queries.

Applicability

Canvas can be used wherever multiple analysts or decision makers interact with data over time. It also supports human-in-the-loop exploration of machine learning models trained with very few training points. Example applications include health care (disease identification and management) and product recommendations (finding similar objects of interest to a consumer).

Advantages

  • Enables use of pictures and images, which stymie many analytical programs
  • Continually learns, leading to faster analyses with greater insights
  • Leverages the power of analytical thinking across users to identify data trends and relationships

Availability

Available for licensing in all fields

Keywords

data visualization, machine learning, analysis, analyses, data analysis, Canvas, crowdsourcing information, deep learning, neural networks

Portfolio

DS-Machine Learning/AI

Market Sectors

Data Sciences