Network Analysis and Visualisation
Coordinator: Seok-Hee Hong, University of Sydney and NICTA
Scope of the domain
Recent technological advances produce large volumes of data and have led to large, complex network models in many application domains. Examples include:
- Webgraphs: These are huge; the whole web graph consists of billions of nodes.
- Software engineering: Large-scale software engineering deals with very large sets of software modules and relationships between them. Analysis of such networks is essential for design, performance tuning, and refactoring legacy code.
- Biological networks: Protein-protein interaction (PPI) networks, metabolic pathways, gene regulatory networks and phylogenetic networks are used by biologists to analyse and engineer biochemical materials. They typically have up to a few thousand nodes, but the relationships between those nodes are very complex.
- Social networks: These include telephone call graphs (used to trace terrorists), money movement networks (used to detect money laundering), and sexual contact graphs (used to analyse epidemics). These networks range from medium-sized to very large.
Understanding these networks is a key enabler for many applications.
Good analysis methods are needed for these networks, and some are available. However, such methods are not useful unless the results are effectively communicated to humans. Visualisation can be an effective tool for the understanding of such networks. Good visualisation reveals the hidden structure of the networks and amplifies human understanding, thus leading to new insights, new findings and possible predictions for the future.
A critical issue for both analysis and visualisation is scale. Existing methods do not scale well enough to be effective on current data sets. Data sets such as telephone call graphs and protein-protein interaction networks are growing considerably faster than our ability to understand them, and existing analysis and visualisation methods fail to deal with the complexity of real-world data. Further, human perception and cognition are limited: even on large megapixel displays, the human brain fails to cope with the visual complexity. Visualisation researchers therefore need to reduce the data set to overcome visual complexity.
In this task force, we will gather researchers from different disciplines, including computer scientists, information systems researchers, sociologists, psychologists and biologists, to initiate collaborative research in analysis and visualisation for large and complex networks.
The main outcome of the task force will be continuing cross-disciplinary research with a common theme of analysis and visualisation for large and complex networks.
Significance
Analysis and visualisation of large and complex networks is a challenging research topic. For example, the current hot topics in Social Network Analysis are scale-free networks (at the fundamental level) and terrorist networks (at the application level). More specifically, we identify the following research challenges:
Scalability: the main challenge of this research is scalability. For example, webgraphs or telephone call graphs gathered by AT&T have billions of nodes. Does it make sense to visualise the whole graph? The use of large display methods provides a partial solution, but in some cases it is impossible to visualise the whole graph, and in many cases one cannot even load the whole graph into main memory. Hence, the design of new analysis and visualisation methods for huge networks is a key research challenge for fields ranging from databases to computer graphics.
Complexity: the second challenge for this task force is complexity. Relationships between actors in a social network, for example, can have a multitude of attributes: observed behavior can be "confirmed" or "unconfirmed", and relationships can be directed or undirected, and weighted by probabilities. Biological networks are also quite complex in nature. Metabolic pathways have only a few thousand nodes, but their relationships and interactions are very complex; for example, the data may be "given" by nature, but some parts of the data may be "unknown" to human scientists. The design of analysis and visualisation methods to resolve these complexity issues is the second research challenge.
Integration of visualisation with analysis: Analysis tools for networks are not useful without visualisation, and visualisation tools are not useful unless they are linked to analysis. This integration of analysis and visualisation of large and complex networks will be the third research challenge.
Network Dynamics: Real-world networks change over time. Many networks, such as webgraphs, evolve relatively slowly. In other cases, such as telephone call networks, the data arrives as a very fast graph stream. Effective and efficient modeling, analysis and visualisation for dynamic networks are challenging research topics.
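To make the scalability and network dynamics challenges concrete, the following is a minimal sketch, not a method proposed by the task force. It assumes the graph is available as a stream of edges that is read once and never stored in full. The record layout (timestamp, caller, callee) and the function names degree_counts and sliding_window_degrees are hypothetical and chosen only for illustration. Memory still grows with the number of distinct nodes, so this addresses only part of the problem described above.

```python
from collections import Counter, deque
from typing import Iterable, Iterator, Tuple

Edge = Tuple[str, str]
# Hypothetical record layout for a streamed call graph: (timestamp, caller, callee).
TimedEdge = Tuple[float, str, str]


def degree_counts(edges: Iterable[Edge]) -> Counter:
    """Compute node degrees in one pass over an edge stream.

    The edges themselves are never stored, so memory grows with the
    number of distinct nodes rather than the number of edges.
    """
    degrees: Counter = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def sliding_window_degrees(
    edges: Iterable[TimedEdge], window: float
) -> Iterator[Tuple[float, Counter]]:
    """Yield (timestamp, degrees) after each edge, counting only edges
    whose timestamp lies within the last `window` time units.

    Assumes edges arrive in non-decreasing timestamp order.
    """
    recent: deque = deque()      # edges currently inside the window
    degrees: Counter = Counter()
    for t, u, v in edges:
        recent.append((t, u, v))
        degrees[u] += 1
        degrees[v] += 1
        # Expire edges that have fallen out of the window.
        while recent and recent[0][0] <= t - window:
            _, old_u, old_v = recent.popleft()
            for node in (old_u, old_v):
                degrees[node] -= 1
                if degrees[node] == 0:
                    del degrees[node]
        yield t, Counter(degrees)  # copy, so callers can keep snapshots


if __name__ == "__main__":
    calls = [(0.0, "a", "b"), (1.0, "a", "c"), (5.0, "b", "c")]
    for t, snapshot in sliding_window_degrees(calls, window=2.0):
        print(t, dict(snapshot))
```

A per-window summary of this kind could feed a visualisation that shows only the currently active part of a fast-changing network, rather than attempting to draw the whole graph.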
Targeted objectives and specific deliverables
Objectives:
- We will identify research opportunities in Network Analysis and Visualisation for the EII network, focusing on the Australian context.
- We will form a research community with cross-disciplinary collaboration, including computer science, information systems, mathematics, statistics, biology and sociology, with a focus on problems in analysis and visualisation of networks.
- We will spawn continuing cross-disciplinary collaboration (that is, continuing after the lifetime of this task force).
- We will assist emerging researchers to find support for research in Network Analysis and Visualisation. This involves helping with linkages to international researchers, industrial funding, and ARC Discovery-style grants.
- The main long-term objective is to spawn a viable high-end industry in software to handle large and complex networks.
Deliverables:
- Two intensive courses: one on Network Analysis, one on Network Visualisation. These will be provided by a combination of overseas and Australian researchers.
- A research workshop on Network Analysis and Visualisation.
- A book on Network Analysis and Visualisation. The book will be written collaboratively by 8-10 early career researchers, as well as some PhD students. Each will contribute a chapter. This will be an extensive state-of-the-art survey of Network Analysis and Visualisation.
- A web resource on Network Analysis and Visualisation, with a collection of papers, data sets, tools and small research projects.
- A number of applications to funding bodies (ARC Discovery, ARC Linkage, and industrial).
Impact for EII network
The short-term impact of this task force will be an Australian research effort to solve problems in the analysis and visualisation of large and complex networks. The aim is to involve early career researchers, and to assist them in academic and industrial research. For the long term, we hope that this task force will be the beginning of an Australian industry in providing software solutions to these problems.
Core Participants
The following are core participants from the EII Network:
- Seokhee Hong (USyd/NICTA), main organizer
- Peter Eades (USyd/NICTA), co-organizer
- Kai Xu (NICTA), co-organizer
- Frank Dehne (Griffith)
- George Havas (UQ)
- Weifa Liang (ANU)
- Xuemin Lin (UNSW)
- Aaron Quigley (UC Dublin)
- Raymond Wong (UNSW)
- Yanchun Zhang (Victoria)
- Bing Bing Zhou (USyd)
- Albert Zomaya (USyd)
- Chengqi Zhang (UTS)
- Phoebe Chen (Deakin)
- Xue Li (UQ)
- Pearl Pu (Swiss Federal Institute of Technology)
- Masahiro Takatsuka (USyd)
Other core participants outside the EII Network include:
- Data Mining: Sanjay Chawla (USyd)
- Information Systems: Byunggu Choi (USyd)
- Network Optimisation: Hiroshi Nagamochi (Kyoto U)
- Network Theory: Miro Kraetzel (DSTO)
- Sociology/Psychology: Pip Pattison, Garry Robins (Umelb)
- Biology: Peter Little, Rohan Williams (UNSW)
