A KAIST research team led by Professor Min-Soo Kim developed a new way to process massive graphs with upward of a trillion edges without having to store the whole graph in the main memory. This method allows even single computers with limited memory to process graphs on a massive scale.

Graphs are a mathematical construct used to represent many real world objects, such as social networks or neural networks. Graphing algorithms often need to process large-scale graphs with more than a trillion edges. Since these large-scale graphs are impractical to store or share due to their sheer size, synthetic graphs — large-scale graphs created using certain parameters or magnified from smaller graphs —  are used instead.

Both methods of creating synthetic graphs require an entire large-scale graph to be loaded into the main memory of the graph processing engine. Tens to hundreds of costly computers are usually required during this process, which results in a financial challenge. To circumvent this problem, the research team decided to instead of generating the entire large-scale graph, they generate only smaller parts of the large-scale graphs that are necessary to run the algorithm. This method allows the results returned to be the same as if the large-scale graph has been generated entirely. The team showed that by using this technology, called T-GPS (Trillion-scale graph processing simulation), a single computer can process a graph of 1 trillion edges. The use of T-GPS uses 10,000 times less computing resources compared to conventional methods. Not only that, T-GPS completes the task 43 times faster than the conventional method because the communication needed between computers is reduced. “T-GPS can significantly increase both the scale and efficiency of developing a new graph algorithm.” Professor Kim says.

Copyright © The KAIST Herald Unauthorized reproduction, redistribution prohibited