Returning search context through API

tf_bds
Active Member

Returning search context through API

Hi,

I'm trying to use the REST API to run a search query on the files in Alfresco. I am able to replicate the results from the web interface, but would like to receive the search term in its context, as can be seen in the web interface results.

For example, when I search for "MacDougall" in the web interface, I get:

And using the API I submit:

http://127.0.0.1:8080/alfresco/service/api/search/keyword.rss?q=MacDougall&p={startPage?}&c={count?}... 

And get back:

<item>
<title>captmidn.txt</title>
<link>http://127.0.0.1:8080/alfresco/service/api/node/content/workspace/SpacesStore/e8758538-7b3a-47fa-894...</link>
<description></description>
<pubDate>2017-03-23T11:59:41.093Z</pubDate>
<guid isPermaLink="false">e8758538-7b3a-47fa-8941-d812c01c1b0d</guid>
</item>

As you can see, the correct result is returned, but the keyword MacDougall is not shown in its context as it is in the browser search.

Is there any way in which the API can return the string "It started out as just another Saturday. April 26, 1986. John R. MacDougall, 25, spent the day..."  along with the search result?

Many thanks

8 Replies
afaust
Master

Re: Returning search context through API

The keyword search endpoint is an obsolete endpoint that is primarily intended for integrating Alfresco search as a search provider into the "quick search" bar of any browser (OpenSearch API). It is not the same as the search that is displayed in the Share UI, which uses the /alfresco/s/slingshot/search endpoint. You should not use that one with Alfresco 5.2 either, because it is an internal API. Instead, use the new v1 ReST API for search.

tf_bds
Active Member

Re: Returning search context through API

Thanks a lot for your reply. I have now switched to the Search API as found in the Alfresco Content Services REST API Explorer.

So my request is now sent via JSON through POST /alfresco/api/-default-/public/search/versions/1/search 

However, in the guide to the search API in the REST API Explorer I still can't find a way to include "snippets" of the search terms in context as can be seen in the results in the Alfresco Share UI (see screenshot in my original post).

The search highlighting option looks relevant but I cannot get a response of the same form as the example (which includes snippets) so it makes no material difference to my query. Any ideas?

Thanks again for your quick reply.

tf_bds
Active Member

Re: Returning search context through API

Sorry to bug you again, but I'm still struggling to find a way to do this. Any chance you could point me in the right direction for accessing the snippet of text containing the search term, as shown in the Share UI? Thanks a lot

afaust
Master

Re: Returning search context through API

If you use the new ReST v1 Search endpoint (/alfresco/api/-default-/public/search/versions/1/search) you can use the following (simplified) JSON request to search for a simple keyword and map that to a predefined snippet. Details about what can be provided to that endpoint are available via the API Explorer webapp.

{
  "query": {
    "query": "_keyword:Test",
    "language": "afts"
  },
  "templates": [{"name": "_keyword","template": "|%name OR |%title OR |%description OR |%content"}]
}

Match highlights can be retrieved by adding the "highlight" parameter.

{
  "query": {
    "query": "_keyword:Test",
    "language": "afts"
  },
  "templates": [{"name": "_keyword","template": "|%name OR |%title OR |%description OR |%content"}],
  "highlight": {
    "prefix": "¿",
    "postfix": "?",
    "mergeContiguous": true,
    "fields": [
      {
        "field": "cm:title"
      },
      {
        "field": "description",
        "prefix": "(",
        "postfix": ")"
      }

    ]
  }
}
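
For anyone calling this from a script, a request like the one above could be built and sent roughly as follows. This is only a minimal Python sketch, not an official client: the host, port and admin/admin credentials are assumptions for a local test install, HTTP Basic authentication is assumed to be enabled, and `build_search_body` and `post_search` are hypothetical helper names.

```python
import base64
import json
import urllib.request

# Endpoint path as quoted in this thread; host and port are assumptions.
SEARCH_URL = (
    "http://127.0.0.1:8080/alfresco/api/-default-/public/search/versions/1/search"
)

# Query template copied from the request above.
KEYWORD_TEMPLATE = "|%name OR |%title OR |%description OR |%content"


def build_search_body(term, fields=("cm:title", "description")):
    """Build the JSON body: keyword query plus a highlight section."""
    return {
        "query": {"query": "_keyword:" + term, "language": "afts"},
        "templates": [{"name": "_keyword", "template": KEYWORD_TEMPLATE}],
        "highlight": {
            "prefix": "¿",
            "postfix": "?",
            "mergeContiguous": True,
            "fields": [{"field": name} for name in fields],
        },
    }


def post_search(body, user="admin", password="admin", url=SEARCH_URL):
    """POST the body as JSON with Basic auth and return the parsed reply.

    The credentials are placeholders; this needs a running repository.
    """
    token = base64.b64encode((user + ":" + password).encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `post_search(build_search_body("Test"))` against a running repository should return the parsed JSON reply; the builder on its own can be used to inspect the body before sending it.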
tf_bds
Active Member

Re: Returning search context through API

Thanks a lot for that. I have used this on the sample data set included in the Community Edition ("Web Site Design Project"), with some success. However, I am a little confused as to why sometimes the API returns the snippet exactly as I see it in the UI, and sometimes, the API does not return a snippet at all.

For example, for a search with the term "project":

In the UI I see:

And in the REST response, I get the expected result:

      {
        "entry": {
          "isFile": true,
          "createdByUser": {
            "id": "abeecher",
            "displayName": "Alice Beecher"
          },
          "modifiedAt": "2011-06-14T10:28:54.714+0000",
          "nodeType": "cm:content",
          "content": {
            "mimeType": "application/pdf",
            "mimeTypeName": "Adobe PDF Document",
            "sizeInBytes": 381778,
            "encoding": "UTF-8"
          },
          "parentId": "e0856836-ed5e-4eee-b8e5-bd7e8fb9384c",
          "createdAt": "2011-02-15T21:26:54.600+0000",
          "isFolder": false,
          "search": {
            "score": 6.46758,
            "highlight": [
              {
                "field": "cm:title",
                "snippets": [
                  "¿Project? Contract for Green Energy"
                ]
              },
              {
                "field": "description",
                "snippets": [
                  "Contract for the Green Energy (project)"
                ]
              }
            ]
          },
          "modifiedByUser": {
            "id": "admin",
            "displayName": "Administrator"
          },
          "name": "Project Contract.pdf",
          "location": "nodes",
          "id": "1a0b110f-1e09-4ca2-b367-fe25e4964a4e"
        }
      },

Which returns the correct snippet as expected.
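
To pull the snippets out of an entry like this on the client side, something along these lines might do. It is a small Python sketch based only on the response shape shown above; `extract_snippets` is a hypothetical helper name, and an entry with no "highlight" key under "search" simply yields an empty dict.

```python
def extract_snippets(entry):
    """Map each highlighted field to its list of snippets.

    `entry` is the inner "entry" object of one search result.
    Returns an empty dict when the result carries no highlight data.
    """
    highlights = entry.get("search", {}).get("highlight", [])
    return {item["field"]: item.get("snippets", []) for item in highlights}


# Trimmed copy of the entry shown above.
sample_entry = {
    "name": "Project Contract.pdf",
    "search": {
        "score": 6.46758,
        "highlight": [
            {"field": "cm:title", "snippets": ["¿Project? Contract for Green Energy"]},
            {"field": "description", "snippets": ["Contract for the Green Energy (project)"]},
        ],
    },
}

print(extract_snippets(sample_entry))
# {'cm:title': ['¿Project? Contract for Green Energy'], 'description': ['Contract for the Green Energy (project)']}
```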

However, in the same results page, I can see in the UI:

But in the response from the API, I only see:

{
        "entry": {
          "isFile": true,
          "createdByUser": {
            "id": "mjackson",
            "displayName": "Mike Jackson"
          },
          "modifiedAt": "2011-06-14T10:28:57.221+0000",
          "nodeType": "cm:content",
          "content": {
            "mimeType": "text/html",
            "mimeTypeName": "HTML",
            "sizeInBytes": 1175,
            "encoding": "UTF-8"
          },
          "parentId": "cdefb3a9-8f55-4771-a9e3-06fa370250f6",
          "createdAt": "2011-02-15T21:46:47.847+0000",
          "isFolder": false,
          "search": {
            "score": 0.30891252
          },
          "modifiedByUser": {
            "id": "admin",
            "displayName": "Administrator"
          },
          "name": "Main_Page",
          "location": "nodes",
          "id": "d6f3a279-ce86-4a12-8985-93b71afbb71d"
        }
      },

This is the correct result, but the snippet has not come through in the API response.

Is there anything I can change in the query or otherwise that can make sure all the snippets I see in the UI also come through in the API response?

Sorry about the confusion before, as I only just realised that the problem wasn't that the snippet never came through, just that it sometimes comes through and sometimes doesn't, even when it is visible in the UI.

I look forward to hearing your ideas.

Many thanks

afaust
Master

Re: Returning search context through API

If you look at the request definition via the api-explorer you should see options you can include with your highlight request. I can see 'maxAnalyzedChars', 'fragmentSize' and 'snippetCount', which all have something to do with how many snippets you get and up to what position within the full text.
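
As a rough illustration, those three options could be added to the "highlight" section like this. This is a Python sketch under assumptions: `build_highlight` is a hypothetical helper, the numeric values in the usage line are arbitrary placeholders, and placing the options at the top level of "highlight" should be checked against the request definition in the API Explorer.

```python
def build_highlight(fields, snippet_count=None, fragment_size=None,
                    max_analyzed_chars=None):
    """Build a "highlight" section, adding only the options that are set."""
    highlight = {"fields": [{"field": name} for name in fields]}
    optional = {
        "snippetCount": snippet_count,
        "fragmentSize": fragment_size,
        "maxAnalyzedChars": max_analyzed_chars,
    }
    # Leave unset options out so the server falls back to its own defaults.
    highlight.update({key: value for key, value in optional.items()
                      if value is not None})
    return highlight


# Placeholder values, for illustration only.
example = build_highlight(["cm:content"], snippet_count=3, fragment_size=150)
```

The resulting dict can be placed under the "highlight" key of the search request body.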

tf_bds
Active Member

Re: Returning search context through API

Thanks Axel, looks like I managed to figure it out.

The highlighting was not happening in the content field, which is why it was inconsistent. For reference, the query that worked was:

{
  "query": {
    "language": "afts",
    "query": "project"
  },
  "paging": {
    "maxItems": 100,
    "skipCount": 0
  },
  "scope": {
    "locations": "nodes"
  },
  "highlight": {
    "prefix": "¿",
    "postfix": "?",
    "snippetCount": 3,
    "mergeContiguous": true,
    "fields": [
      {
        "field": "cm:title"
      },
      {
        "field": "description"
      },
      {
        "field": "cm:content"
      }
    ]
  }
}

Thanks for your help

joga
Member II

Re: Returning search context through API

Hello,

I'm currently writing a webscript with this query:

var def = {
    "query": 'PATH:"//*" AND TEXT:"test"',
    "highlight": {
        "prefix": "¿",
        "postfix": "?",
        "snippetCount": 3,
        "mergeContiguous": true,
        "fields": [
            { "field": "cm:title" },
            { "field": "cm:content" },
            { "field": "cm:description" }
        ]
    }
};

The idea is to highlight the parts of the documents where the TEXT "test" is found.

My problem is that I don't know in which property I can find the matching sentence from the document when I'm building my object, for example:

    project.properties["acmedl:projectName"] = "Project B1";
    project.properties["acmedl:projectNumber"] = "1";
    project.properties["acmedl:projectDescription"] = "Project B1 handling important stuff";
    project.properties["acmedl:projectStartDate"] = startDate;
    project.properties["acmedl:projectActive"] = true;

and render it to the view.

If you have any ideas, that would be awesome! Thanks

PS: Sorry for my English

Joga