VAST Challenge 2014: Mini-Challenge 3



IEEE VAST Challenge 2014 is open for submissions! The VAST Challenge uses the Precision Conference System (PCS) to handle the submission and reviewing process. PCS is available at https://precisionconference.com/~vgtc/. If you do not already have a login for the system you must register first. Once you are logged into your account please choose VAST 2014 Challenge under “new submissions” and follow the instructions.

Frequently-asked questions will be added at the bottom of this page as they are received. Check back frequently for updated information.

Server Outages

The server being used to stream the Mini-Challenge 3 data may experience occasional planned outages for maintenance purposes. Planned outages will be announced here.

There are no outages currently planned.


Note: This scenario and all the people, places, groups, and technologies contained therein are fictitious. Any resemblance to real people, places, groups, or technologies is purely coincidental.

It is January 23, 2014, and the GAStech employees have been missing for three days. It is becoming increasingly urgent for law enforcement personnel to find the missing employees and return them home safely.

As an expert in visual analytics who is assigned to support law enforcement in this investigation, you have been asked to focus your efforts on maintaining situation awareness as events unfold. You have access to a single data stream containing two major sources:

  1. Microblog records that have been identified by automated filters as being potentially relevant to the ongoing incident
  2. Text transcripts of emergency dispatches by the Abila, Kronos local police and fire departments.

From this data, can you identify what is happening and help law enforcement personnel find the missing individuals?

You also have access to maps of Abila and background documents. (Note: These are the same materials provided in Mini-Challenge 1 and Mini-Challenge 2.)

Use visual analytics to analyze the available data and develop responses to the questions to be provided. In addition, prepare a video that shows how you used visual analytics to solve this challenge. Submission instructions are available here.

Questions for Participants

Please note: this challenge contains a question that is time-dependent. Within 3 hours of starting the Segment 3 data stream, send an email to VASTChal2014MC3 at vacommunity.org containing your answer to question MC3.1. Please also include a copy of your answer to MC3.1 in your final answer form. Your answers to MC3.2 and MC3.3, along with your video, are due July 8.

The responses to these questions should be incrementally built, as you (the contestant) acquire information from each streaming data segment you receive. Your submission will answer these questions in consideration of all of the streaming data segments.

MC3.1 – Within 3 hours of starting the Segment 3 data stream, send an email to VASTChal2014MC3 at vacommunity.org containing:
  1. An image showing the streaming data in your visual analytics tool. In this image, identify an event of interest that you intend to investigate further.
  2. The content of the final message in the data stream.

MC3.2 – Describe the timeline of up to five major events that you discover in the streaming data. This timeline should include information from all three segments of the data stream if needed. Use specific microblog records and call center data to support your description, but do not simply mimic back the data stream. Provide a concise description of important participants, locations and durations. Focus your response on the events themselves, rather than on the individuals reporting the events. Please limit your answer to no more than ten images and 1500 words.

MC3.3 – Select one of your five major events from question MC3.2 that you consider to be most likely to provide additional clues to the investigation of the GAStech disappearances. Describe the roles of the participants. Describe how other events you identified in MC3.2 may have influenced your selected event. Provide a hypothesis and evidence as to whom you suspect as being directly involved in the GAStech disappearances, either as perpetrators or victims. Please limit your response to no more than five images and 500 words.

Available Data

The data for Mini-Challenge 3 is being released in three segments.

Segment 1 covers the time period from 1700 to 1830 Abila time on January 23. This data can be downloaded in comma-separated values (CSV) format or streamed from the data server multiple times.

Segment 2 covers the time period from 1830 to 2000 Abila time on January 23. This data can be streamed from the data server multiple times, but it is not available in a batch CSV file.

Segment 3 covers the time period from 2000 to shortly after 2130 Abila time on January 23. This data is available for streaming one time only. You will be asked to perform your analysis in real time as the data streams in.
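As a rough illustration of the segment boundaries above, the following Python sketch maps a record timestamp to a segment number. It assumes the YYYYMMDDHHMMSS timestamp format used by the data stream, and it assumes that each boundary time (for example 1830) belongs to the later segment; the challenge description does not say which segment a boundary record falls in. The name segment_for is hypothetical. Because Segment 3 ends "shortly after 2130", no upper limit is applied.

```python
from datetime import date, datetime, time
from typing import Optional

# Start time of each segment on January 23, 2014 (Abila time).
# Assumption: a record stamped exactly at a boundary belongs to the later segment.
SEGMENT_STARTS = [(time(17, 0), 1), (time(18, 30), 2), (time(20, 0), 3)]
CHALLENGE_DAY = date(2014, 1, 23)


def segment_for(raw_timestamp: str) -> Optional[int]:
    """Return 1, 2, or 3 for a YYYYMMDDHHMMSS timestamp, or None if it is too early."""
    ts = datetime.strptime(raw_timestamp, "%Y%m%d%H%M%S")
    if ts.date() != CHALLENGE_DAY:
        return None
    segment = None
    for start, number in SEGMENT_STARTS:
        if ts.time() >= start:
            segment = number
    return segment
```

A lookup like this can be handy for labeling captured records when comparing events across segments.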

The data stream includes two types of records.

Microblog messages (mbdata). The microblog messages included here use some of the conventions you may be familiar with from Twitter. The @ symbol is used to designate a username within the body of a message. Hashtags (#) are used to relate the message to specific topics. “RT” at the start of a message indicates that the current user is re-sending another user’s message. Spam and junk messages are also common.
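The conventions described above can be picked out with a few regular expressions. The Python sketch below is only one possible approach: it assumes usernames and hashtags consist of letters, digits, and underscores, which the challenge description does not specify, and the function name parse_conventions is hypothetical.

```python
import re

# Assumption: usernames and hashtags are runs of word characters (letters, digits, _).
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
RETWEET_RE = re.compile(r"RT\b")


def parse_conventions(message: str) -> dict:
    """Extract the re-send flag, mentioned usernames, and hashtags from a message."""
    text = message.strip()
    return {
        "is_resend": RETWEET_RE.match(text) is not None,  # "RT" at the very start
        "mentions": MENTION_RE.findall(text),
        "hashtags": HASHTAG_RE.findall(text),
    }


# Made-up message, for illustration only (not taken from the data stream).
example = parse_conventions("RT @someuser: road closed downtown #Abila #traffic")
```

Since spam and junk messages are common, hashtag counts computed this way will probably need filtering before they are used to detect events.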

In this data stream, messages may be delivered in small batches, so it is possible for multiple messages from an individual to have the exact same date/timestamp. It is also possible for messages to be delivered slightly out of order. One major difference between these microblog messages and Twitter messages is that the messages in this stream may be up to 200 characters in length.
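Because messages can share a timestamp and may arrive slightly out of order, a streaming tool may want to hold records briefly before displaying them. The following Python sketch shows one way to do this with a small heap-based buffer. The class name ReorderBuffer and the max_held parameter are hypothetical, and the right buffer size depends on how far out of order messages actually arrive, which is not stated here.

```python
import heapq
from itertools import count


class ReorderBuffer:
    """Holds a few records and releases them in timestamp order."""

    def __init__(self, max_held: int = 50):
        self.max_held = max_held
        self._heap = []
        self._arrival = count()  # tie-breaker: keeps arrival order for equal timestamps

    def push(self, timestamp: str, record):
        """Add a record; return the records that are now released, oldest first."""
        # YYYYMMDDHHMMSS strings sort chronologically when compared as text.
        heapq.heappush(self._heap, (timestamp, next(self._arrival), record))
        released = []
        while len(self._heap) > self.max_held:
            released.append(heapq.heappop(self._heap)[2])
        return released

    def flush(self):
        """Release everything still held, oldest first (call when the stream ends)."""
        released = []
        while self._heap:
            released.append(heapq.heappop(self._heap)[2])
        return released
```

The trade-off is latency: a larger buffer tolerates more disorder but delays what the analyst sees, which matters when Segment 3 must be analyzed in real time.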

The format of the microblog messages is as follows:
  • date: in YYYYMMDDHHMMSS format. For example, July 8, 2014 at 11:59 pm would be 20140708235900.
  • author: author name, a text string
  • message: the message string, containing up to 200 characters of content
  • latitude: latitude from which the message was sent (optional)
  • longitude: longitude from which the message was sent (optional)
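The field list above can be turned into a typed record. The Python sketch below assumes each incoming record has already been decoded into a dictionary keyed by the field names listed above; how the server actually packages records is not described here, so that part is an assumption. The names MicroblogRecord, parse_timestamp, and parse_microblog are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class MicroblogRecord:
    date: datetime
    author: str
    message: str
    latitude: Optional[float] = None   # optional in the stream
    longitude: Optional[float] = None  # optional in the stream


def parse_timestamp(raw: str) -> datetime:
    """Convert a YYYYMMDDHHMMSS string into a datetime."""
    return datetime.strptime(raw, "%Y%m%d%H%M%S")


def _optional_float(raw) -> Optional[float]:
    """Return None for a missing or empty value, otherwise the value as a float."""
    if raw is None or str(raw).strip() == "":
        return None
    return float(raw)


def parse_microblog(fields: dict) -> MicroblogRecord:
    """Build a MicroblogRecord from a dict keyed by the field names above."""
    return MicroblogRecord(
        date=parse_timestamp(fields["date"]),
        author=fields["author"],
        message=fields["message"],
        latitude=_optional_float(fields.get("latitude")),
        longitude=_optional_float(fields.get("longitude")),
    )
```

For the example given in the format description, parse_timestamp("20140708235900") yields July 8, 2014 at 23:59:00.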

Call center data (ccdata). Emergency call center data is included in the data stream as appropriate. The format of these records is as follows:
  • date: in YYYYMMDDHHMMSS format. For example, July 8, 2014 at 11:59 pm would be 20140708235900.
  • message: the message string, containing up to 200 characters of content
  • location: the street or cross streets at which the incident took place (optional)
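Call center records differ from microblog records in that they carry no author and give a street location instead of coordinates. A minimal Python sketch of a matching record type follows. As before, it assumes the record arrives as a dictionary keyed by the field names above, and the names CallCenterRecord and parse_call_center are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class CallCenterRecord(NamedTuple):
    date: datetime
    message: str
    location: Optional[str]  # street or cross streets; optional in the stream


def parse_call_center(fields: dict) -> CallCenterRecord:
    """Build a CallCenterRecord from a dict keyed by the field names above."""
    location = (fields.get("location") or "").strip()
    return CallCenterRecord(
        date=datetime.strptime(fields["date"], "%Y%m%d%H%M%S"),
        message=fields["message"],
        location=location or None,  # treat an empty string as missing
    )
```

Placing these records on a map would require matching the street names against the Abila map data, which this sketch does not attempt.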

Registration and Downloads

Data are available now! Enter your email address here in order to download the background information, the Segment 1 data in CSV format, and the sample programs for connecting to the data stream.

When you are ready to begin working with the streaming data, send an email to VASTChal2014MC3 at vacommunity.org with your name, your institution, and the names of your other team members. You will receive a response that includes the name of the web server to connect to, a user ID to use for the web service connection, and a team number for use on your answer form. Please note that these responses are not automated, so it may take a day or two to receive a response.


Frequently Asked Questions

Question 1: Is it necessary to register for each data stream segment individually or does a single registration grant access to each segment?

Answer: Although there are 3 data stream segments, it is only necessary to register once. A single registration grants access to all of the data stream segments. Segment 3 can be streamed only once; the other segments can be streamed multiple times.

Question 2: I am trying to access the Mini-Challenge 3 data streams through a proxy server. Will this work?

Answer: Your proxy server must be configured to explicitly allow dedicated web socket traffic from the streaming data server. Otherwise, the connection will not succeed.

Question 3: Please clarify what is needed for question MC3.1.

Answer: Please respond to question MC3.1 within 3 hours of starting the Segment 3 data stream. The goal of question MC3.1 is to show that you actually visualized the streaming data. In your response to question MC3.1, please send a screen capture showing the streaming data in your visual analytics tool and include the text of the last message you received in the data stream. In addition, we would like you to identify an event of interest that you saw in the data, which we anticipate you will address in your later answers to MC3.2 and MC3.3. It is very important that you submit your answer to this question within 3 hours of starting the data stream for Segment 3, so that the reviewers know you analyzed the streaming data.

Question 4: Please clarify what is needed within the three hours and how much of what is shared at that point needs to be incorporated in the remaining MC3 questions submitted by the July 8 deadline. For example, can we include images that are substantially different to the one submitted within the three hours?

Answer: The goal of Mini-Challenge 3 is to encourage innovation in visual analytics for streaming data. With that in mind, the question MC3.1 is to illustrate your streaming visual analytics solution in action. What you submit for MC3.1 should be exactly the same as what you email in within three hours of starting Segment 3, as this is intended to provide evidence that you actually did stream the data into a visual analytics tool.

Images and supporting information you include for MC3.2 and MC3.3 can be different from what you provide for MC3.1. Your answers to MC3.2 and MC3.3 can reflect deeper thinking and investigation than you could perform in the moment while the data is streaming past.

We hope that people will iteratively build their tools and analyze the Segment 1 and 2 data, probably streaming it multiple times. Once their tool is solid and they have a good handle on the events in the Segment 1 and 2 data, then it would make sense to run Segment 3 and respond to MC3.1. After streaming the Segment 3 data, they probably will want to take more time to consider and develop answers to MC3.2 and MC3.3.

Question 5: May we capture the streaming data and perform retrospective analysis on it?

Answer: There is nothing to prevent you from capturing the data and holding onto it for further analysis as it streams in, and this would likely be useful for answering the Grand Challenge. However, for MC3 we are hoping to see innovation in streaming data visual analytics, so we encourage you to emphasize innovation in that area.

Question 6: When are the data streams available? Do I have to access them at a specific time?

Answer: All of the data streams in Mini-Challenge 3 are accessible at your convenience. That is, you may start to download a stream whenever you are ready to process the information. Data Segment 1 and Data Segment 2 can be accessed and streamed to your client at any time, and as many times as you want. This should help you refine your stream analysis processes. Data Segment 3 can only be accessed and streamed to your client ONE time. It can be accessed at any time you select, but only once. Practicing with Segment 1 and Segment 2 should help you prepare for this.

Page last modified on Monday, June 23, 2014