Core Summarizer API

This API gives you RESTful access to our Core Summarizer NLP engine.

The Core Summarizer is designed to increase salience and reduce redundancy. It performs best on multi-document summaries and incrementally developing news stories. It is not designed to produce coherent, paragraph-like summaries, so we display its results as bullet points. The intended use case is to cover as many topics from the source documents as possible.

Summarize

This API endpoint allows users to create an automatic summary of multiple documents.

Request

Request URL

https://api.agolo.com/nlp/v0.2/summarize

Request headers

  • Content-Type (optional)
    string
    Media type of the body sent to the API.
  • Ocp-Apim-Subscription-Key
    string
    Subscription key which provides access to this API. Found in your Profile.

Request body

The POST request body should be a JSON object in the following format:

{
  "summary_length": <how many sentences the output summary should contain>,
  "articles": [<Article objects>],
  "coref": false,
  "sort_by_salience": false,
  "include_all_sentences": false
}

The Article objects must have one of the following formats:

  • Web article URLs:

    { "url": "https://www.reuters.com/article/us-usa-immigration/white-house-softens-tone-after-threat-to-close-border-with-mexico-idUSKCN1RE1PE" }
    
  • Article title and body in plain text:

    {
      "title": "An unlikely contender, Sanders takes on ‘billionaire class’ in 2016 bid",
      "text": "The White House took a step back on Tuesday from a threat to close the U.S. border with Mexico, even as a redeployment of border officers in recent days has led to a slowdown of legal crossings and commerce at U.S. ports of entry there. White House spokeswoman Sarah Sanders said the..."
    }
    
  • Optionally, metadata about the article can be embedded in the article's metadata field as a JSON object. This information is passed through unchanged to the summary response:

    { "metadata": {"origin": "Azure portal"} }
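The Article formats above can be assembled with a small helper. This is a minimal sketch, not part of the API: the function name make_article and its strict rule (either url, or title plus text, never both) are assumptions made for illustration, and the example.com URL is a placeholder. Only the field names url, title, text and metadata come from the formats above.

```python
import json


def make_article(url=None, title=None, text=None, metadata=None):
    """Build one Article object (hypothetical helper, not part of the API)."""
    if url is not None and (title is not None or text is not None):
        raise ValueError("pass either url, or title and text, not both")
    if url is not None:
        article = {"url": url}
    elif title is not None and text is not None:
        article = {"title": title, "text": text}
    else:
        raise ValueError("an article needs a url, or both title and text")
    if metadata is not None:
        # Per the description above, metadata is passed through to the response.
        article["metadata"] = metadata
    return article


body = {
    "summary_length": 3,
    "articles": [
        make_article(url="https://example.com/story", metadata={"origin": "Azure portal"}),
        make_article(title="Example title", text="Example body text."),
    ],
}
print(json.dumps(body, indent=2))
```

Rejecting mixed input is a design choice of this sketch; the API itself may well accept other combinations.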
    

Optional Flags

There are several optional flags that can be set to true to modify the output of the summarizer API:

  • include_quotes_in_summary: When set to true, quotes are extracted from the article's full text and included in the summary, when applicable. When set to false, quotes do not appear in the summary even if extract_quotes is set to true. Defaults to false.

  • extract_quotes: When set to true, quotes are extracted from the source's full text and listed separately, outside the summary. Quotes are not included in the summary itself if include_quotes_in_summary is set to false. Defaults to false.

  • sort_by_salience: When set to true, the sentences in the results are ordered by salience. When set to false, the sentences are ordered by appearance in the source text, so the first sentence is not necessarily the most salient point; it may simply appear earlier in the source than the second sentence. Defaults to false.

  • include_all_sentences: Determines whether all sentences in the input article(s) are included in the summary. If set to true, the summary_length parameter is ignored. Defaults to false. Useful when combined with the sort_by_salience flag.

  • coref: Determines whether coreferences are resolved. May cause a performance hit if set to true. Defaults to false. We do not recommend setting this flag to true unless you specifically need coreference resolution.
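The flag defaults described above can be captured in one place so that a request only states what it changes. The following is a sketch under assumptions: DEFAULT_FLAGS and resolve_flags are hypothetical names, and rejecting unknown flag names is a choice of this example rather than documented API behavior.

```python
# Defaults as stated in the flag descriptions above (every flag defaults to false).
DEFAULT_FLAGS = {
    "include_quotes_in_summary": False,
    "extract_quotes": False,
    "sort_by_salience": False,
    "include_all_sentences": False,
    "coref": False,
}


def resolve_flags(**overrides):
    """Merge caller overrides into the defaults (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULT_FLAGS)
    if unknown:
        raise ValueError("unknown flag(s): " + ", ".join(sorted(unknown)))
    flags = dict(DEFAULT_FLAGS)
    flags.update(overrides)
    return flags


# Return every input sentence, ordered by salience.
# With include_all_sentences set, summary_length is ignored.
flags = resolve_flags(include_all_sentences=True, sort_by_salience=True)
print(flags)
```

The resulting dictionary can be merged into the request body alongside summary_length and articles.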

Passing multiple inputs

You can perform multi-document summarization by passing multiple article URLs or text in the request body. The algorithm assumes the input articles are about the same topic or story.

If you pass two distinct stories, the summary for that cluster will not be very meaningful. For example, a story about a semiconductor company’s stock falling due to an earthquake and a story about the earthquake itself are related stories but are distinct stories.

Multiple inputs can be passed in this JSON format:

{
    "summary_length": "5",
    "articles": [
        {
            "type": "article",
            "url": "https://venturebeat.com/2017/05/03/microsoft-invests-in-agolo-a-startup-thats-fighting-information-overload-with-automated-summarizations/",
            "metadata": {"id": "PostmanAPIPortalProxyURL1", "key2": "value1"}
        },
        {
            "type": "article",
            "url": "https://www.geekwire.com/2017/microsoft-ventures-doubles-ai-new-investments-agolo-bonsai/",
            "metadata": {"id": "PostmanAPIPortalProxyURL2", "key2": "value2"}
        }
    ]
}
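Because the summarizer assumes that all articles in one request cover the same story, a client that tracks several stories would typically send one request per story. The sketch below illustrates that grouping. The helper name bodies_per_story, the story labels, the example.com URLs and the id scheme in metadata are all assumptions made for illustration.

```python
def bodies_per_story(stories, summary_length):
    """Return one request body per story (hypothetical helper).

    `stories` maps a story label to the list of article URLs covering it.
    """
    bodies = {}
    for label, urls in stories.items():
        bodies[label] = {
            "summary_length": summary_length,
            "articles": [
                # The id travels in metadata, which the API passes through,
                # so each summary can be traced back to its article.
                {"url": url, "metadata": {"id": "{0}-{1}".format(label, i)}}
                for i, url in enumerate(urls, start=1)
            ],
        }
    return bodies


stories = {
    "earthquake": ["https://example.com/quake-1", "https://example.com/quake-2"],
    "chip-stock": ["https://example.com/stock-1"],
}
for label, body in bodies_per_story(stories, 5).items():
    print(label, len(body["articles"]))
```

Keeping related but distinct stories (such as the earthquake and the stock story above) in separate requests follows the guidance in this section.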

Example

Requests

{
    "summary_length": "3",
    "sort_by_salience": "false",
    "extract_quotes": "false",
    "include_quotes_in_summary": "false",
    "articles": [
        {
            "type": "article",
            "url": "https://apnews.com/f79cc6f6223b489ab73eb32f1f7187e5",
            "metadata": {"id": "PostmanAPIPortalProxyURL1", "key2": "value2"}
        }
    ]
}
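A request like the one above could be sent from Python 3 using only the standard library. This is a minimal sketch, not an official client: the names build_request and summarize and the 30-second timeout are assumptions, and the body uses a JSON number and boolean (as in the format under Request body) rather than the quoted strings of the example. The endpoint and header names are the ones used elsewhere on this page.

```python
import json
import urllib.request

ENDPOINT = "https://api.agolo.com/nlp/v0.2/summarize"


def build_request(body, subscription_key):
    """Prepare (but do not send) the POST request for a summarize call."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


def summarize(body, subscription_key, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(body, subscription_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


example_body = {
    "summary_length": 3,
    "sort_by_salience": False,
    "articles": [{"url": "https://apnews.com/f79cc6f6223b489ab73eb32f1f7187e5"}],
}
request = build_request(example_body, "{subscription key}")
print(request.get_method(), request.full_url)
# To actually call the API: result = summarize(example_body, "<your key>")
```

Separating build_request from summarize makes it possible to inspect the headers and body without making a network call.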

Responses

200 OK

A successful response will be a JSON object with the following attributes:

  • title: The title of the summary, if applicable.

  • photos: An array of photo URLs from the input items, if applicable.

  • summary: An array of summary sentences and item metadata, grouped by item. Each element is a Summary object.

Each Summary object contains the following fields:

  • metadata: An object that is a transparent pass-through of the metadata from the input Articles.

  • sentences: An array of sentences summarizing the input items.

  • ranks: Each element in this array corresponds to each sentence in the sentences array. The integer denotes where the sentence ranks in salience across all Articles. 1 is the most salient sentence in the entire collection of Articles.

  • quotes: An array of sentences that are quotes. Populated only if the extract_quotes flag is set to true.
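Since ranks and sentences are parallel arrays, a client usually pairs them up before display. The sketch below does that and orders the result by salience across all Summary objects. The function name sentences_by_salience and the sample data are made up for illustration; only the field names come from the description above.

```python
def sentences_by_salience(response):
    """Return (rank, sentence, metadata) tuples, most salient first."""
    ranked = []
    for item in response.get("summary", []):
        metadata = item.get("metadata", {})
        # ranks[i] belongs to sentences[i].
        for rank, sentence in zip(item.get("ranks", []), item.get("sentences", [])):
            ranked.append((rank, sentence, metadata))
    # Rank 1 is the most salient sentence, so sort ascending.
    ranked.sort(key=lambda entry: entry[0])
    return ranked


# Trimmed, made-up response with the same shape as the fields described above.
sample = {
    "title": "Example",
    "summary": [
        {
            "metadata": {"id": "doc-1"},
            "sentences": ["First in the text.", "Second in the text."],
            "ranks": [2, 1],
            "quotes": [],
        }
    ],
}
for rank, sentence, metadata in sentences_by_salience(sample):
    print(rank, metadata["id"], sentence)
```

This client-side ordering is useful when sort_by_salience was left at false and the sentences arrive in order of appearance.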

Representations

{
    "title": "Trump's threat to close border stirs fears of economic harm",
    "title_candidates": [],
    "photos": [
        "https://storage.googleapis.com/afs-prod/media/media:9bd5b33247534157b904293deb6f0b49/3000.jpeg"
    ],
    "summary": [
        {
            "metadata": {
                "id": "PostmanAPIPortalProxyURL1",
                "key2": "value2",
                "url": "https://apnews.com/f79cc6f6223b489ab73eb32f1f7187e5",
                "source": "apnews.com",
                "icon": "https://apnews.com/branding/favicon/16.png"
            },
            "sentences": [
                "President Donald Trump’s threat to shut down the southern border raised fears Monday of dire economic consequences in the U. S. and an upheaval of daily life in a stretch of the country that relies on the international flow of not just goods and services but also students, families and workers.",
                "Politicians, business leaders and economists warned that such a move would block incoming shipments of fruits and vegetables, TVs, medical devices and other products and cut off people who commute to their jobs or school or come across to go shopping.",
                "The Trump administration said Monday as many as 2,000 U. S. inspectors who screen cargo and vehicles at ports of entry along the Mexican border may be reassigned to help handle the surge of migrants."
            ],
            "ranks": [
                3,
                2,
                1
            ],
            "quotes": []
        }
    ]
}

Code samples

Curl

@ECHO OFF

curl -v -X POST "https://api.agolo.com/nlp/v0.2/summarize" ^
-H "Content-Type: application/json" ^
-H "Ocp-Apim-Subscription-Key: {subscription key}" ^
--data-ascii "{body}"

C#

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

namespace CSHttpClientSample
{
    static class Program
    {
        static void Main()
        {
            // Block until the request completes so the response is printed before exit.
            MakeRequest().GetAwaiter().GetResult();
            Console.WriteLine("Hit ENTER to exit...");
            Console.ReadLine();
        }

        static async Task MakeRequest()
        {
            var client = new HttpClient();

            // Request headers
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "{subscription key}");

            var uri = "https://api.agolo.com/nlp/v0.2/summarize";

            // Request body
            byte[] byteData = Encoding.UTF8.GetBytes("{body}");

            using (var content = new ByteArrayContent(byteData))
            {
                content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
                HttpResponseMessage response = await client.PostAsync(uri, content);

                // Print the raw JSON response.
                Console.WriteLine(await response.Content.ReadAsStringAsync());
            }
        }
    }
}

Java

// This sample uses the Apache HTTP client from HTTP Components (http://hc.apache.org/httpcomponents-client-ga/)
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class JavaSample 
{
    public static void main(String[] args) 
    {
        HttpClient httpclient = HttpClients.createDefault();

        try
        {
            URIBuilder builder = new URIBuilder("https://api.agolo.com/nlp/v0.2/summarize");


            URI uri = builder.build();
            HttpPost request = new HttpPost(uri);
            request.setHeader("Content-Type", "application/json");
            request.setHeader("Ocp-Apim-Subscription-Key", "{subscription key}");


            // Request body
            StringEntity reqEntity = new StringEntity("{body}");
            request.setEntity(reqEntity);

            HttpResponse response = httpclient.execute(request);
            HttpEntity entity = response.getEntity();

            if (entity != null) 
            {
                System.out.println(EntityUtils.toString(entity));
            }
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}

JavaScript

<!DOCTYPE html>
<html>
<head>
    <title>JSSample</title>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>

<script type="text/javascript">
    $(function() {
        var params = {
            // Request parameters
        };
      
        $.ajax({
            url: "https://api.agolo.com/nlp/v0.2/summarize?" + $.param(params),
            beforeSend: function(xhrObj){
                // Request headers
                xhrObj.setRequestHeader("Content-Type","application/json");
                xhrObj.setRequestHeader("Ocp-Apim-Subscription-Key","{subscription key}");
            },
            type: "POST",
            // Request body
            data: "{body}",
        })
        .done(function(data) {
            alert("success");
        })
        .fail(function() {
            alert("error");
        });
    });
</script>
</body>
</html>

Objective-C

#import <Foundation/Foundation.h>

int main(int argc, const char * argv[])
{
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    
    NSString* path = @"https://api.agolo.com/nlp/v0.2/summarize";
    NSArray* array = @[
                         // Request parameters
                         @"entities=true",
                      ];
    
    NSString* string = [array componentsJoinedByString:@"&"];
    path = [path stringByAppendingFormat:@"?%@", string];

    NSLog(@"%@", path);

    NSMutableURLRequest* _request = [NSMutableURLRequest requestWithURL:[NSURL URLWithString:path]];
    [_request setHTTPMethod:@"POST"];
    // Request headers
    [_request setValue:@"application/json" forHTTPHeaderField:@"Content-Type"];
    [_request setValue:@"{subscription key}" forHTTPHeaderField:@"Ocp-Apim-Subscription-Key"];
    // Request body
    [_request setHTTPBody:[@"{body}" dataUsingEncoding:NSUTF8StringEncoding]];
    
    NSURLResponse *response = nil;
    NSError *error = nil;
    NSData* _connectionData = [NSURLConnection sendSynchronousRequest:_request returningResponse:&response error:&error];

    if (nil != error)
    {
        NSLog(@"Error: %@", error);
    }
    else
    {
        NSError* error = nil;
        NSMutableDictionary* json = nil;
        NSString* dataString = [[NSString alloc] initWithData:_connectionData encoding:NSUTF8StringEncoding];
        NSLog(@"%@", dataString);
        
        if (nil != _connectionData)
        {
            json = [NSJSONSerialization JSONObjectWithData:_connectionData options:NSJSONReadingMutableContainers error:&error];
        }
        
        if (error || !json)
        {
            NSLog(@"Could not parse loaded json with error:%@", error);
        }
        
        NSLog(@"%@", json);
        _connectionData = nil;
    }
    
    [pool drain];

    return 0;
}

PHP

<?php
// This sample uses the PEAR HTTP_Request2 package (https://pear.php.net/package/HTTP_Request2)
require_once 'HTTP/Request2.php';

$request = new HTTP_Request2('https://api.agolo.com/nlp/v0.2/summarize');
$url = $request->getUrl();

$headers = array(
    // Request headers
    'Content-Type' => 'application/json',
    'Ocp-Apim-Subscription-Key' => '{subscription key}',
);

$request->setHeader($headers);

$parameters = array(
    // Request parameters
);

$url->setQueryVariables($parameters);

$request->setMethod(HTTP_Request2::METHOD_POST);

// Request body
$request->setBody("{body}");

try
{
    $response = $request->send();
    echo $response->getBody();
}
catch (HTTP_Request2_Exception $ex)
{
    echo $ex;
}

?>

Python

########### Python 2.7 #############
import httplib, urllib, base64

headers = {
    # Request headers
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': '{subscription key}',
}

params = urllib.urlencode({
})

try:
    conn = httplib.HTTPSConnection('api.agolo.com')
    conn.request("POST", "/nlp/v0.2/summarize?%s" % params, "{body}", headers)
    response = conn.getresponse()
    data = response.read()
    print(data)
    conn.close()
except Exception as e:
    # Not every exception carries errno/strerror, so print the exception itself.
    print("Error: {0}".format(e))

####################################

########### Python 3.2 #############
import http.client, urllib.request, urllib.parse, urllib.error, base64

headers = {
    # Request headers
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': '{subscription key}',
}

params = urllib.parse.urlencode({
})

try:
    conn = http.client.HTTPSConnection('api.agolo.com')
    conn.request("POST", "/nlp/v0.2/summarize?%s" % params, "{body}", headers)
    response = conn.getresponse()
    data = response.read()
    print(data)
    conn.close()
except Exception as e:
    # Not every exception carries errno/strerror, so print the exception itself.
    print("Error: {0}".format(e))

####################################

Ruby

require 'net/http'

uri = URI('https://api.agolo.com/nlp/v0.2/summarize')

request = Net::HTTP::Post.new(uri.request_uri)
# Request headers
request['Content-Type'] = 'application/json'
request['Ocp-Apim-Subscription-Key'] = '{subscription key}'
# Request body
request.body = "{body}"

response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
    http.request(request)
end

puts response.body