Introduction

This tutorial briefly explains how to use the OCD Service. It covers a few basic requests, such as uploading a graph, retrieving saved graphs, and creating a cover from a graph.

However, this is only a fraction of what the service offers. Other functionality includes, for instance, the execution of metrics to determine the quality of the calculated covers, and the computation of benchmark graphs and covers for generating standardized test data. At the bottom, an overview of all possible requests is given. For a more detailed description of the requests, refer to the OCD Service’s source code.

All responses are sent in XML format, except for actual graph and cover data (which use the specified output format) and JSON visualization data.

For more detailed information about using the WebOCD Service, refer to the tutorials, which explain how to use the WebOCD Service with the WebClient.

Graphs

List Available Graphs

GET graphs

Returns the ids (or meta information) of multiple graphs.

Optional query parameters are:

  • firstIndex: The result list index of the first id to return. Defaults to 0.
  • length: The (maximum) number of ids to return. Defaults to Long.MAX_VALUE.
  • includeMeta: If TRUE, instead of the ids the META XML of each graph is returned. Defaults to FALSE.
  • executionStatuses: If set, only those graphs are returned whose creation method has one of the given ExecutionStatus names. This is only relevant for graphs created by benchmark models. Allowed values are WAITING (graph computation has not started yet), RUNNING (graph computation is in progress), COMPLETED (graph computation has finished) and ERROR (an error has occurred during graph computation). Uploaded graphs always have the execution status COMPLETED. Multiple status names are separated with the “-” delimiter. Defaults to the empty string, which is equivalent to all statuses.
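
As an illustration of how these query parameters combine, the following minimal Python sketch builds the URL for a GET graphs request. The base URL (host and port) and the helper name list_graphs_url are assumptions made for this example and are not part of the service; authentication is omitted.

```python
from urllib.parse import urlencode

# Assumption: the service is reachable under this base URL; adjust to your deployment.
BASE_URL = "http://localhost:8080/ocd"


def list_graphs_url(first_index=0, length=None, include_meta=False, statuses=()):
    """Build the URL for GET graphs (hypothetical helper, not part of the service)."""
    params = {
        "firstIndex": first_index,
        "includeMeta": "TRUE" if include_meta else "FALSE",
    }
    if length is not None:
        # Leaving length out lets the service apply its own default.
        params["length"] = length
    if statuses:
        # Multiple ExecutionStatus names are joined with the "-" delimiter.
        params["executionStatuses"] = "-".join(statuses)
    return f"{BASE_URL}/graphs?{urlencode(params)}"


url = list_graphs_url(length=10, include_meta=True, statuses=("RUNNING", "COMPLETED"))
print(url)
```

The resulting URL can be opened with any HTTP client; the response is an XML document listing the ids or, with includeMeta set to TRUE, the META XML of each graph.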

Upload

POST graphs

Uploads a graph. The data describing the graph is sent in the request body in accordance with the input format. Returns the graph id.

Optional query parameters are:

  • name: Any arbitrary String to name the graph. Defaults to unnamed.
  • creationType: Meta information describing what the graph is based on. Allowed values are REAL_WORLD (based on real world data), NEWMAN (based on the Newman Benchmark Model), LFR (based on the LFR Benchmark Model) and UNDEFINED. Defaults to UNDEFINED.
  • inputFormat: The name of the graph input format. Allowed values are GRAPH_ML, WEIGHTED_EDGE_LIST, UNWEIGHTED_EDGE_LIST, NODE_WEIGHTED_EDGE_LIST, GML, NODE_CONTENT_EDGE_LIST, XML and XGMML. Note that only some of these formats can carry edge weights. Defaults to GRAPH_ML.
  • doMakeUndirected: Defines whether directed edges are turned into undirected edges (TRUE) or not (FALSE). Defaults to FALSE.
  • startDate: The start date of the data rows that are used as graph input.
  • endDate: The end date of the data rows that are used as graph input.
  • indexPath: The path of the index that is used to speed up graph creation.
  • filePath: Only used for Stack Exchange XML data. Mandatory for XML data files.
  • body: The text of open source developer or learning forum data, given as a plain text file. Mandatory for the Node Content Edge List graph input.
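
The upload request can be sketched in Python as follows. This is only an illustration under several assumptions: the base URL, the text/plain content type, the helper name build_upload_request, and the edge-list payload (one edge per line, written as two node names) are all chosen for this example and are not confirmed by this document. The request object is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the service is reachable under this base URL; adjust to your deployment.
BASE_URL = "http://localhost:8080/ocd"


def build_upload_request(graph_data, name="unnamed", input_format="GRAPH_ML",
                         creation_type="UNDEFINED", make_undirected=False):
    """Build a POST graphs request (hypothetical helper, not part of the service)."""
    query = urlencode({
        "name": name,
        "creationType": creation_type,
        "inputFormat": input_format,
        "doMakeUndirected": "TRUE" if make_undirected else "FALSE",
    })
    return Request(
        f"{BASE_URL}/graphs?{query}",
        data=graph_data.encode("utf-8"),          # graph description goes into the request body
        method="POST",
        headers={"Content-Type": "text/plain"},   # assumption: plain text is accepted
    )


# Assumption: illustrative payload only; check the service for the exact edge list syntax.
edge_list = "a b\nb c\nc a\n"
request = build_upload_request(edge_list, name="triangle",
                               input_format="UNWEIGHTED_EDGE_LIST", make_undirected=True)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen against a running service would send it; the response then contains the id of the new graph.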

Retrieve

GET graphs/{graphId}

Returns the graph with id {graphId} in accordance with the specified output format.

Optional query parameters are:

  • outputFormat: The name of the graph output format. Allowed values are GRAPH_ML, WEIGHTED_EDGE_LIST and META_XML. Defaults to GRAPH_ML.

Delete

DELETE graphs/{graphId}

Deletes the graph with id {graphId} and all covers belonging to it. Returns a confirmation.
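
Retrieving and deleting a graph use the same path and differ only in the HTTP method. The sketch below shows one way to build both requests in Python; the base URL and the helper name graph_request are assumptions, and the query key is assumed to be spelled outputFormat, as in the cover requests.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the service is reachable under this base URL; adjust to your deployment.
BASE_URL = "http://localhost:8080/ocd"

# Output formats listed for graph retrieval.
GRAPH_OUTPUT_FORMATS = ("GRAPH_ML", "WEIGHTED_EDGE_LIST", "META_XML")


def graph_request(graph_id, method="GET", output_format="GRAPH_ML"):
    """Build a GET or DELETE request for graphs/{graphId} (hypothetical helper)."""
    url = f"{BASE_URL}/graphs/{graph_id}"
    if method == "GET":
        if output_format not in GRAPH_OUTPUT_FORMATS:
            raise ValueError(f"unsupported graph output format: {output_format}")
        url += "?" + urlencode({"outputFormat": output_format})
    elif method != "DELETE":
        raise ValueError(f"unsupported method: {method}")
    return Request(url, method=method)


fetch = graph_request(7, output_format="META_XML")
remove = graph_request(7, method="DELETE")
print(fetch.get_method(), fetch.full_url)
print(remove.get_method(), remove.full_url)
```

Validating the format name on the client side is a design choice of this sketch; it simply avoids a round trip for a value the service would not accept.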

Covers

List Available Covers

GET covers

Returns the ids (or meta information) of multiple covers.

Optional query parameters are:

  • firstIndex: The result list index of the first id to return. Defaults to 0.
  • length: The number of ids to return. Defaults to Long.MAX_VALUE.
  • includeMeta: If TRUE, instead of the ids the META XML of each cover is returned. Defaults to FALSE.
  • executionStatuses: If set, only those covers are returned whose creation method status corresponds to one of the given ExecutionStatus names. Allowed values are WAITING (cover computation has not started yet), RUNNING (cover computation is in progress), COMPLETED (cover computation has finished) and ERROR (an error has occurred during cover computation). Uploaded covers always have the execution status COMPLETED. Multiple status names are separated using the “-” delimiter. Defaults to the empty string, which is equivalent to all statuses.
  • metricExecutionStatuses: If set, only those covers are returned that have a corresponding metric log with a status corresponding to one of the given ExecutionStatus names. Allowed values are WAITING (metric computation has not started yet), RUNNING (metric computation is in progress), COMPLETED (metric computation has finished) and ERROR (an error has occurred during metric computation). Multiple status names are separated using the “-” delimiter. Defaults to the empty string, which results in not filtering out any covers.
  • graphId: If set only those covers are returned that are based on the corresponding graph.
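
Because firstIndex and length select a window of the result list, they can be used to page through the covers of a graph. The following Python sketch illustrates this; the base URL, the helper name cover_page_url and the zero-based page numbering are assumptions made for the example.

```python
from urllib.parse import urlencode

# Assumption: the service is reachable under this base URL; adjust to your deployment.
BASE_URL = "http://localhost:8080/ocd"


def cover_page_url(page, page_size, graph_id=None, statuses=(), metric_statuses=()):
    """Build the GET covers URL for one result page (hypothetical helper)."""
    params = {
        "firstIndex": page * page_size,  # index of the first id on this page
        "length": page_size,             # number of ids per page
    }
    if statuses:
        params["executionStatuses"] = "-".join(statuses)
    if metric_statuses:
        params["metricExecutionStatuses"] = "-".join(metric_statuses)
    if graph_id is not None:
        params["graphId"] = graph_id
    return f"{BASE_URL}/covers?{urlencode(params)}"


# Third page (pages are counted from 0) of the completed covers of graph 7.
third_page = cover_page_url(page=2, page_size=25, graph_id=7, statuses=("COMPLETED",))
print(third_page)
```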

Retrieve

GET covers/{coverId}/graphs/{graphId}

Returns the cover with id {coverId} belonging to the graph with id {graphId} in a specified format.

Optional query parameters are:

  • outputFormat: The cover output format. Allowed values are META_XML, DEFAULT_XML and LABELED_MEMBERSHIP_MATRIX. Defaults to LABELED_MEMBERSHIP_MATRIX.

Delete

DELETE covers/{coverId}/graphs/{graphId}

Deletes the cover with id {coverId} belonging to the graph with id {graphId}. If the cover is still being created by an algorithm, the algorithm is terminated. If the cover is still being created by a ground truth benchmark, the benchmark is terminated and the corresponding graph is deleted as well. If metrics are running on the cover, they are terminated.
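
The cover requests above can be sent from Python along the following lines. This is a sketch, not a reference client: the base URL, the helper names cover_url and send, and the UTF-8 decoding of the response are assumptions, and authentication is left out. The send function is only defined here and needs a running service to be called.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode

# Assumption: the service is reachable under this base URL; adjust to your deployment.
BASE_URL = "http://localhost:8080/ocd"

# Output formats listed for cover retrieval.
COVER_OUTPUT_FORMATS = ("META_XML", "DEFAULT_XML", "LABELED_MEMBERSHIP_MATRIX")


def cover_url(cover_id, graph_id, output_format=None):
    """Build the URL for covers/{coverId}/graphs/{graphId} (hypothetical helper)."""
    url = f"{BASE_URL}/covers/{cover_id}/graphs/{graph_id}"
    if output_format is not None:
        if output_format not in COVER_OUTPUT_FORMATS:
            raise ValueError(f"unsupported cover output format: {output_format}")
        url += "?" + urlencode({"outputFormat": output_format})
    return url


def send(method, url, timeout=30):
    """Send the request and return the response body as text."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8")  # assumption: UTF-8 response
    except urllib.error.HTTPError as error:
        raise RuntimeError(f"{method} {url} failed with HTTP status {error.code}") from error


# Against a running service one could then call, for example:
#   send("GET", cover_url(103, 7, "META_XML"))
#   send("DELETE", cover_url(103, 7))
print(cover_url(103, 7, "META_XML"))
```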

Compute

POST covers/graphs/{graphId}/algorithms

Creates a new cover by running an overlapping community detection algorithm on the graph with id {graphId}. The graph must have the creation method status COMPLETED. The request body must have the format <parameters> </parameters>, which may optionally contain entries that set the algorithm’s execution parameters to values other than the defaults.

Optional query parameters are:

  • graphId: The id of the graph on which the algorithm creates the cover.
  • name: The name for the cover. Defaults to unnamed.
  • algorithm: The algorithm to execute. Allowed values are RANDOM_WALK_LABEL_PROPAGATION_ALGORITHM, SPEAKER_LISTENER_LABEL_PROPAGATION_ALGORITHM, EXTENDED_SPEAKER_LISTENER_LABEL_PROPAGATION_ALGORITHM, SSK_ALGORITHM, LINK_COMMUNITIES_ALGORITHM, WEIGHTED_LINK_COMMUNITIES_ALGORITHM, CLIZZ_ALGORITHM, MERGING_OF_OVERLAPPING_COMMUNITIES_ALGORITHM and BINARY_SEARCH_RANDOM_WALK_LABEL_PROPAGATION_ALGORITHM. Defaults to SPEAKER_LISTENER_LABEL_PROPAGATION_ALGORITHM.
  • body: The parameters for the selected algorithm. Sending a GET request to the running OCD service at http://:8080/ocd/algorithms//parameters/default returns the default parameters as XML. Put this XML data into the body parameter.
  • contentWeighting: Defaults to false. It can be set to true when content-based algorithms are used.
  • componentNodeCountFilter: May be set to a natural number to consider each connected component of a size smaller than the filter as one separate single community (independent of the algorithm).

For example, suppose we upload a graph and receive the graph id 7, and now want to run SPEAKER_LISTENER_LABEL_PROPAGATION_ALGORITHM on it. Before running the algorithm, retrieve its default parameters in XML format via GET /ocd/algorithms/{CoverCreationType}/parameters/default. Different algorithms may have different parameters.

After that, send a POST request to run the algorithm with the following parameters:

  • graphId: 7
  • name: Email_Cover
  • algorithm: SPEAKER_LISTENER_LABEL_PROPAGATION_ALGORITHM
  • body: <?xml version="1.0" encoding="UTF-16"?> <Parameters><Parameter><Name>memorySize</Name><Value>100</Value></Parameter><Parameter><Name>probabilityThreshold</Name><Value>0.15</Value></Parameter></Parameters>
  • contentWeighting: false
  • componentNodeCountFilter: 0

A cover is then generated, in this example with the cover id 103.
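
The example above can be sketched in Python as follows. The base URL, the text/plain content type and all helper names are assumptions made for this illustration. The parameter XML is generated without an XML declaration, which differs from the body shown above; whether the service requires the declaration is not verified here. The request is only built, not sent.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the service is reachable under this base URL; adjust to your deployment.
BASE_URL = "http://localhost:8080/ocd"


def default_parameters_url(algorithm):
    """URL of the default parameters of an algorithm (hypothetical helper)."""
    return f"{BASE_URL}/algorithms/{algorithm}/parameters/default"


def parameters_xml(values):
    """Serialize a name -> value mapping into the <Parameters> structure shown above."""
    root = ET.Element("Parameters")
    for name, value in values.items():
        parameter = ET.SubElement(root, "Parameter")
        ET.SubElement(parameter, "Name").text = name
        ET.SubElement(parameter, "Value").text = str(value)
    return ET.tostring(root, encoding="unicode")


def build_algorithm_request(graph_id, name, algorithm, values,
                            content_weighting=False, component_node_count_filter=0):
    """Build the POST covers/graphs/{graphId}/algorithms request (hypothetical helper)."""
    query = urlencode({
        "name": name,
        "algorithm": algorithm,
        "contentWeighting": "true" if content_weighting else "false",
        "componentNodeCountFilter": component_node_count_filter,
    })
    return Request(
        f"{BASE_URL}/covers/graphs/{graph_id}/algorithms?{query}",
        data=parameters_xml(values).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "text/plain"},  # assumption: plain text is accepted
    )


algorithm = "SPEAKER_LISTENER_LABEL_PROPAGATION_ALGORITHM"
body = parameters_xml({"memorySize": 100, "probabilityThreshold": 0.15})
request = build_algorithm_request(7, "Email_Cover", algorithm,
                                  {"memorySize": 100, "probabilityThreshold": 0.15})
print(default_parameters_url(algorithm))
print(request.get_method(), request.full_url)
print(body)
```

Building the XML from a mapping rather than editing a string by hand keeps the element nesting consistent when only some values differ from the defaults.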