Cockatrice 0.4.0 documentation¶
Cockatrice is an open-source search and indexing server written in Python. It provides scalable indexing and search, faceting, hit highlighting, and advanced analysis/tokenization capabilities.
Features¶
- Full-text search and indexing
- Faceting
- Result highlighting
- Easy deployment
- Bringing up a cluster
- Index replication
- An easy-to-use RESTful API
Source Code¶
Requirements¶
Python 3.x interpreter
Contents¶
Getting Started¶
Installation of Cockatrice on Unix-compatible or Windows servers generally requires a Python interpreter and the pip command.
Installing Cockatrice¶
Cockatrice is registered on PyPI, so you can install it by running the following command:
$ pip install cockatrice
Starting Cockatrice¶
Cockatrice includes a command-line tool called bin/cockatrice, which starts Cockatrice on your system.
To start Cockatrice, simply enter:
$ cockatrice server
This will start Cockatrice listening on the default port (8080).
$ curl -s -X GET http://localhost:8080/
You can see the result in plain text format. The result of the above command is:
cockatrice 0.1.0 is running.
Schema management¶
First of all, you need to create a schema definition. Cockatrice fully supports the field types, analyzers, tokenizers and filters provided by Whoosh. This section explains how to describe schema definition.
Schema Design¶
Cockatrice defines the schema in YAML format. YAML is a human-friendly data serialization standard for all programming languages.
The following items are defined in YAML:
- schema
- default_search_field
- field_types
- analyzers
- tokenizers
- filters
Schema¶
The schema is the place where you tell Cockatrice how it should build indexes from input documents.
schema:
  <FIELD_NAME>:
    field_type: <FIELD_TYPE>
    args:
      <ARG_NAME>: <ARG_VALUE>
  ...
<FIELD_NAME>
: The field name in the document.
<FIELD_TYPE>
: The field type used in this field.
<ARG_NAME>
: The argument name to use when constructing the field.
<ARG_VALUE>
: The argument value to use when constructing the field.
For example, an id field used as a unique key is defined as follows:
schema:
  id:
    field_type: id
    args:
      unique: true
      stored: true
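To make the mapping concrete, here is a sketch (our own illustration, not Cockatrice's actual loader) of how such a parsed definition relates field names to their field types and constructor arguments. The parsed dict below is what a YAML parser would produce for the example above:

```python
# What yaml.safe_load would produce for the schema YAML above.
parsed = {
    "schema": {
        "id": {
            "field_type": "id",
            "args": {"unique": True, "stored": True},
        }
    }
}

def field_settings(parsed_schema):
    """Return {field_name: (field_type, args)} from a parsed definition."""
    return {
        name: (spec["field_type"], spec.get("args", {}))
        for name, spec in parsed_schema["schema"].items()
    }

print(field_settings(parsed))
```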
Default Search Field¶
The query parser uses this as the field for any terms without an explicit field.
default_search_field: <FIELD_NAME>
<FIELD_NAME>
: Uses this as the field name for any terms without an explicit field name.
For example, use the text field as the default search field as follows:
default_search_field: text
Field Types¶
The field type defines how Cockatrice should interpret data in a field and how the field can be queried. There are many field types included with Whoosh by default, and they can also be defined directly in YAML.
field_types:
  <FIELD_TYPE>:
    class: <FIELD_TYPE_CLASS>
    args:
      <ARG_NAME>: <ARG_VALUE>
<FIELD_TYPE>
: The field type name.
<FIELD_TYPE_CLASS>
: The field type class.
<ARG_NAME>
: The argument name to use when constructing the field type.
<ARG_VALUE>
: The argument value to use when constructing the field type.
For example, a text field type is defined as follows:
field_types:
  text:
    class: whoosh.fields.TEXT
    args:
      analyzer:
      phrase: true
      chars: false
      stored: false
      field_boost: 1.0
      multitoken_query: default
      spelling: false
      sortable: false
      lang: null
      vector: null
      spelling_prefix: spell_
Analyzers¶
An analyzer is defined either with a class element whose class attribute is a fully qualified Python class name, or with a tokenizer and the filters to use, in the order you want them to run.
analyzers:
  <ANALYZER_NAME>:
    class: <ANALYZER_CLASS>
    args:
      <ARG_NAME>: <ARG_VALUE>
  <ANALYZER_NAME>:
    tokenizer: <TOKENIZER_NAME>
    filters:
      - <FILTER_NAME>
<ANALYZER_NAME>
: The analyzer name.
<ANALYZER_CLASS>
: The analyzer class.
<ARG_NAME>
: The argument name to use when constructing the analyzer.
<ARG_VALUE>
: The argument value to use when constructing the analyzer.
<TOKENIZER_NAME>
: The tokenizer name to use in the analyzer chain.
<FILTER_NAME>
: The filter name to use in the analyzer chain.
For example, analyzers are defined using class, or using tokenizer and filters, as follows:
analyzers:
  simple:
    class: whoosh.analysis.SimpleAnalyzer
    args:
      expression: "\\w+(\\.?\\w+)*"
      gaps: false
  ngram:
    tokenizer: ngram
    filters:
      - lowercase
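To illustrate what the composed ngram analyzer above does conceptually, here is a toy sketch in plain Python (not Whoosh's implementation): an n-gram tokenizer feeds a lowercase filter, with filters applied in the order listed.

```python
# Toy analyzer chain: tokenizer -> filters, applied in order.
def ngram_tokenizer(text, minsize=2, maxsize=3):
    """Yield all character n-grams of length minsize..maxsize."""
    for size in range(minsize, maxsize + 1):
        for i in range(len(text) - size + 1):
            yield text[i:i + size]

def lowercase_filter(tokens):
    for token in tokens:
        yield token.lower()

def analyze(text, tokenizer, filters):
    tokens = tokenizer(text)
    for f in filters:  # filters run in the order they are listed
        tokens = f(tokens)
    return list(tokens)

print(analyze("Web", ngram_tokenizer, [lowercase_filter]))
# -> ['we', 'eb', 'web']
```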
Tokenizers¶
The job of a tokenizer is to break up a stream of text into tokens, where each token is (usually) a sub-sequence of the characters in the text.
tokenizers:
  <TOKENIZER_NAME>:
    class: <TOKENIZER_CLASS>
    args:
      <ARG_NAME>: <ARG_VALUE>
<TOKENIZER_NAME>
: The tokenizer name.
<TOKENIZER_CLASS>
: The tokenizer class.
<ARG_NAME>
: The argument name to use when constructing the tokenizer.
<ARG_VALUE>
: The argument value to use when constructing the tokenizer.
For example, a tokenizer is defined as follows:
tokenizers:
  ngram:
    class: whoosh.analysis.NgramTokenizer
    args:
      minsize: 2
      maxsize: null
Filters¶
The job of a filter is usually easier than that of a tokenizer since in most cases a filter looks at each token in the stream sequentially and decides whether to pass it along, replace it or discard it.
filters:
  <FILTER_NAME>:
    class: <FILTER_CLASS>
    args:
      <ARG_NAME>: <ARG_VALUE>
<FILTER_NAME>
: The filter name.
<FILTER_CLASS>
: The filter class.
<ARG_NAME>
: The argument name to use when constructing the filter.
<ARG_VALUE>
: The argument value to use when constructing the filter.
For example, a filter is defined as follows:
filters:
  stem:
    class: whoosh.analysis.StemFilter
    args:
      lang: en
      ignore: null
      cachesize: 50000
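To make the pass/replace/discard behavior concrete, here is a toy filter in plain Python. It is NOT Whoosh's StemFilter; the stopword set and the suffix stripping are deliberately naive and purely illustrative.

```python
# Toy filter showing the three choices a filter has for each token:
# pass it along, replace it, or discard it.
STOPWORDS = {"the", "a", "an"}

def toy_stem_filter(tokens):
    for token in tokens:
        if token in STOPWORDS:
            continue              # discard the token
        if token.endswith("ing"):
            yield token[:-3]      # replace it with a crude stem
        else:
            yield token           # pass it along unchanged

print(list(toy_stem_filter(["the", "searching", "engine"])))
# -> ['search', 'engine']
```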
Example¶
Refer to the following example for how to define a schema:
https://github.com/mosuka/cockatrice/blob/master/example/schema.yaml
More information¶
See documents for more information.
Index management¶
You need to create an index after starting Cockatrice. You can also delete indexes that are no longer needed.
Create an index¶
A schema is required to create an index, so you need to include the schema in the request. Create an index with the following command:
$ curl -s -X PUT -H "Content-type: text/x-yaml" --data-binary @./conf/schema.yaml http://localhost:8080/myindex?use_ram_storage=True | jq .
You can see the result in JSON format. The result of the above command is:
{
"status": {
"code": 202,
"description": "Request accepted, processing continues off-line",
"phrase": "Accepted"
},
"time": 0.08018112182617188
}
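When scripting calls like the one above, it helps to build URLs with properly encoded query parameters. A small sketch using only the standard library (the index_url helper is our own, not part of Cockatrice):

```python
# Build index-management URLs with safely encoded query parameters.
from urllib.parse import urlencode

def index_url(base, index_name, **params):
    """Return the URL for an index endpoint, e.g. with use_ram_storage."""
    url = f"{base}/{index_name}"
    if params:
        url += "?" + urlencode(params)
    return url

print(index_url("http://localhost:8080", "myindex", use_ram_storage=True))
# -> http://localhost:8080/myindex?use_ram_storage=True
```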
Get an index¶
If you created an index, you can retrieve index information by the following command:
$ curl -s -X GET http://localhost:8080/myindex | jq .
The result of the above command is:
{
"index": {
"doc_count": 0,
"doc_count_all": 0,
"last_modified": -1,
"latest_generation": 0,
"name": "myindex",
"storage": {
"files": [
"_myindex_0.toc"
],
"folder": "",
"readonly": false,
"supports_mmap": false
},
"version": -111
},
"status": {
"code": 200,
"description": "Request fulfilled, document follows",
"phrase": "OK"
},
"time": 0.0014028549194335938
}
Delete an index¶
You can delete indexes that are no longer needed. Delete an index by the following command:
$ curl -s -X DELETE http://localhost:8080/myindex | jq .
You can see the result in JSON format. The result of the above command is:
{
"status": {
"code": 202,
"description": "Request accepted, processing continues off-line",
"phrase": "Accepted"
},
"time": 0.0006439685821533203
}
Document management¶
Once indices are created, you can update them by adding and removing documents.
Index a document¶
If you have already created an index named myindex, you can index a document with the following command:
$ curl -s -X PUT -H "Content-Type:application/json" http://localhost:8080/myindex/_doc/1 -d @./example/doc1.json | jq .
You can see the result in JSON format. The result of the above command is:
{
"status": {
"code": 202,
"description": "Request accepted, processing continues off-line",
"phrase": "Accepted"
},
"time": 0.00015020370483398438
}
Get a document¶
If you have already indexed a document with ID 1 in myindex, you can retrieve it by specifying its ID, using the following command:
$ curl -s -X GET http://localhost:8080/myindex/_doc/1 | jq .
You can see the result in JSON format. The result of the above command is:
{
"doc": {
"fields": {
"contributor": "43.225.167.166",
"id": "1",
"text": "A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits. Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload. The most public, visible form of a search engine is a Web search engine which searches for information on the World Wide Web.",
"timestamp": "20180704054100",
"title": "Search engine (computing)"
}
},
"status": {
"code": 200,
"description": "Request fulfilled, document follows",
"phrase": "OK"
},
"time": 0.011947870254516602
}
Delete a document¶
You can delete a document from myindex with the following command:
$ curl -s -X DELETE http://localhost:8080/myindex/_doc/1 | jq .
You can see the result in JSON format. The result of the above command is:
{
"status": {
"code": 202,
"description": "Request accepted, processing continues off-line",
"phrase": "Accepted"
},
"time": 6.699562072753906e-05
}
Index documents in bulk¶
You can index documents in bulk with the following command:
$ curl -s -X PUT -H "Content-Type:application/json" http://localhost:8080/myindex/_docs -d @./example/bulk_index.json | jq .
You can see the result in JSON format. The result of the above command is:
{
"status": {
"code": 202,
"description": "Request accepted, processing continues off-line",
"phrase": "Accepted"
},
"time": 0.00018596649169921875
}
Delete documents in bulk¶
You can delete documents in bulk with the following command:
$ curl -s -X DELETE -H "Content-Type:application/json" http://localhost:8080/myindex/_docs -d @./example/bulk_delete.json | jq .
You can see the result in JSON format. The result of the above command is:
{
"status": {
"code": 202,
"description": "Request accepted, processing continues off-line",
"phrase": "Accepted"
},
"time": 0.00232696533203125
}
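The exact contents of example/bulk_index.json and example/bulk_delete.json are not shown here, but per the Document APIs reference, the bulk index body is a JSON array of documents and the bulk delete body is a JSON array of document IDs. A sketch of building both bodies with the standard library (the field names besides id are illustrative):

```python
# Build the two bulk request bodies: an array of documents for
# PUT _docs, and an array of document IDs for DELETE _docs.
import json

docs = [
    {"id": "1", "title": "Search engine (computing)"},
    {"id": "2", "title": "Web search engine"},
]
index_body = json.dumps(docs)                       # body for PUT _docs
delete_body = json.dumps([d["id"] for d in docs])   # body for DELETE _docs

print(delete_body)
# -> ["1", "2"]
```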
Search documents¶
Once you have created an index and added documents to it, you can search for those documents.
Searching documents¶
You can search documents with the following command:
$ curl -s -X GET http://localhost:8080/myindex/_search?query=search | jq .
You can see the result in JSON format. The result of the above command is:
{
"results": {
"hits": [
{
"doc": {
"fields": {
"contributor": "KolbertBot",
"id": "3",
"text": "Enterprise search is the practice of making content from multiple enterprise-type sources, such as databases and intranets, searchable to a defined audience. \"Enterprise search\" is used to describe the software of search information within an enterprise (though the search function and its results may still be public). Enterprise search can be contrasted with web search, which applies search technology to documents on the open web, and desktop search, which applies search technology to the content on a single computer. Enterprise search systems index data and documents from a variety of sources such as: file systems, intranets, document management systems, e-mail, and databases. Many enterprise search systems integrate structured and unstructured data in their collections.[3] Enterprise search systems also use access controls to enforce a security policy on their users. Enterprise search can be seen as a type of vertical search of an enterprise.",
"timestamp": "20180129125400",
"title": "Enterprise search"
}
},
"pos": 0,
"rank": 0,
"score": 1.6081254289003828
},
{
"doc": {
"fields": {
"contributor": "Nurg",
"id": "5",
"text": "Federated search is an information retrieval technology that allows the simultaneous search of multiple searchable resources. A user makes a single query request which is distributed to the search engines, databases or other query engines participating in the federation. The federated search then aggregates the results that are received from the search engines for presentation to the user. Federated search can be used to integrate disparate information resources within a single large organization (\"enterprise\") or for the entire web. Federated search, unlike distributed search, requires centralized coordination of the searchable resources. This involves both coordination of the queries transmitted to the individual search engines and fusion of the search results returned by each of them.",
"timestamp": "20180716000600",
"title": "Federated search"
}
},
"pos": 1,
"rank": 1,
"score": 1.5904654156439162
},
{
"doc": {
"fields": {
"contributor": "Aistoff",
"id": "2",
"text": "A web search engine is a software system that is designed to search for information on the World Wide Web. The search results are generally presented in a line of results often referred to as search engine results pages (SERPs). The information may be a mix of web pages, images, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running an algorithm on a web crawler. Internet content that is not capable of being searched by a web search engine is generally described as the deep web.",
"timestamp": "20181005132100",
"title": "Web search engine"
}
},
"pos": 2,
"rank": 2,
"score": 1.515225291088596
},
{
"doc": {
"fields": {
"contributor": "43.225.167.166",
"id": "1",
"text": "A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits. Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload. The most public, visible form of a search engine is a Web search engine which searches for information on the World Wide Web.",
"timestamp": "20180704054100",
"title": "Search engine (computing)"
}
},
"pos": 3,
"rank": 3,
"score": 1.4922266045792822
},
{
"doc": {
"fields": {
"contributor": "Citation bot",
"id": "4",
"text": "A distributed search engine is a search engine where there is no central server. Unlike traditional centralized search engines, work such as crawling, data mining, indexing, and query processing is distributed among several peers in a decentralized manner where there is no single point of control.",
"timestamp": "20180930171400",
"title": "Distributed search engine"
}
},
"pos": 4,
"rank": 4,
"score": 1.4257952540764172
}
],
"is_last_page": true,
"page_count": 1,
"page_len": 5,
"page_num": 1,
"total": 5
},
"status": {
"code": 200,
"description": "Request fulfilled, document follows",
"phrase": "OK"
},
"time": 0.012568235397338867
}
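Responses like the one above are plain JSON, so post-processing is straightforward. A small sketch (the helper name is our own) that extracts (title, score) pairs from results.hits, shown here against a trimmed-down response:

```python
# Extract (title, score) pairs from a Cockatrice search response.
response = {
    "results": {
        "hits": [
            {"doc": {"fields": {"title": "Enterprise search"}}, "score": 1.61},
            {"doc": {"fields": {"title": "Federated search"}}, "score": 1.59},
        ],
        "total": 2,
    }
}

def titles_with_scores(resp):
    return [(hit["doc"]["fields"]["title"], hit["score"])
            for hit in resp["results"]["hits"]]

print(titles_with_scores(response))
```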
Cluster management¶
You already know how to start Cockatrice in standalone mode, but that is not fault-tolerant. If you need fault tolerance, bring up a cluster.
Create a cluster with static membership¶
It is easy to bring up a Cockatrice cluster. You can bring up a 3-node cluster with static membership using the following commands:
$ cockatrice server --bind-addr=127.0.0.1:7070 --peer-addr=127.0.0.1:7071 --peer-addr=127.0.0.1:7072 --index-dir=/tmp/cockatrice/node1/index --http-port=8080
$ cockatrice server --bind-addr=127.0.0.1:7071 --peer-addr=127.0.0.1:7070 --peer-addr=127.0.0.1:7072 --index-dir=/tmp/cockatrice/node2/index --http-port=8081
$ cockatrice server --bind-addr=127.0.0.1:7072 --peer-addr=127.0.0.1:7070 --peer-addr=127.0.0.1:7071 --index-dir=/tmp/cockatrice/node3/index --http-port=8082
The above example runs each Cockatrice node on the same host, so each node must listen on different ports. This would not be necessary if each node ran on a different host.
You now have a 3-node cluster that can tolerate the failure of one node.
You can check the cluster with the following command:
$ cockatrice status --bind-addr=127.0.0.1:7070 | jq .
You can see the result in JSON format. The result of the above command is:
{
"message": "SUCCESS",
"data": {
"version": "0.3.4",
"revision": "2c8a3263d0dbe3f8d7b8a03e93e86d385c1de558",
"self": "127.0.0.1:7070",
"state": 2,
"leader": "127.0.0.1:7070",
"partner_nodes_count": 2,
"partner_node_status_server_127.0.0.1:7071": 2,
"partner_node_status_server_127.0.0.1:7072": 2,
"readonly_nodes_count": 0,
"unknown_connections_count": 1,
"log_len": 56,
"last_applied": 59,
"commit_idx": 59,
"raft_term": 110,
"next_node_idx_count": 2,
"next_node_idx_server_127.0.0.1:7071": 60,
"next_node_idx_server_127.0.0.1:7072": 60,
"match_idx_count": 2,
"match_idx_server_127.0.0.1:7071": 59,
"match_idx_server_127.0.0.1:7072": 59,
"leader_commit_idx": 59,
"uptime": 10860,
"self_code_version": 0,
"enabled_code_version": 0
}
}
An odd number of three or more nodes is recommended for the cluster. With a single node, data loss is inevitable in failure scenarios, so avoid single-node deployments.
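The odd-number recommendation follows from majority quorum, assuming the usual rule for Raft-style replication (the status output above includes Raft fields): a cluster of n nodes stays available only while a majority of floor(n/2) + 1 nodes is up, so it tolerates floor((n - 1) / 2) failures. A quick sketch:

```python
# Failures a majority-quorum cluster of n nodes can tolerate.
def tolerated_failures(n):
    return (n - 1) // 2

for n in (1, 2, 3, 4, 5):
    print(n, "nodes ->", tolerated_failures(n), "failures tolerated")
# Note: a 4th node adds no tolerance over 3, which is why odd sizes
# are recommended.
```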
Once the cluster is created, you can create indices. Let's create an index via 127.0.0.1:8080 with the following command:
$ curl -s -X PUT -H "Content-type: text/x-yaml" --data-binary @./conf/schema.yaml http://localhost:8080/myindex | jq .
If the above command succeeds, the same index will be created on every node in the cluster. Check the index on each node:
$ curl -s -X GET http://localhost:8080/myindex | jq .
$ curl -s -X GET http://localhost:8081/myindex | jq .
$ curl -s -X GET http://localhost:8082/myindex | jq .
Let's index a document via 127.0.0.1:8080 with the following command:
$ curl -s -X PUT -H "Content-Type:application/json" http://localhost:8080/myindex/_doc/1 -d @./example/doc1.json | jq .
If the above command succeeds, the same document will be indexed on every node in the cluster. Check the document on each node:
$ curl -s -X GET http://localhost:8080/myindex/_doc/1 | jq .
$ curl -s -X GET http://localhost:8081/myindex/_doc/1 | jq .
$ curl -s -X GET http://localhost:8082/myindex/_doc/1 | jq .
Create a cluster with dynamic membership by manual operation¶
Dynamic membership change allows you to add or remove nodes without restarting the cluster. This section describes how to scale the cluster. Let's start the first node with the following command:
$ cockatrice server --bind-addr=127.0.0.1:7070 --index-dir=/tmp/cockatrice/node1/index --http-port=8080
Then, execute the join command for the new node on one of the existing nodes.
$ cockatrice join --bind-addr=127.0.0.1:7070 --join-addr=127.0.0.1:7071
127.0.0.1:7070
is one of the existing cluster nodes, and 127.0.0.1:7071
is the node you want to add.
The above command will wait until the new node starts up. Launch the new node with the correct initial peers in another terminal window:
$ cockatrice server --bind-addr=127.0.0.1:7071 --index-dir=/tmp/cockatrice/node2/index --peer-addr=127.0.0.1:7070 --http-port=8081
Again, an odd number of three or more nodes is recommended to avoid split-brain. Launch one more node with the correct initial peers, as follows:
$ cockatrice join --bind-addr=127.0.0.1:7070 --join-addr=127.0.0.1:7072
$ cockatrice server --bind-addr=127.0.0.1:7072 --index-dir=/tmp/cockatrice/node3/index --peer-addr=127.0.0.1:7070 --peer-addr=127.0.0.1:7071 --http-port=8082
Create a cluster with dynamic membership without manual operation¶
The above section described how to create a cluster with dynamic membership by manual operation. While that method gives the administrator precise control, Cockatrice also provides an easier way to create a cluster with dynamic membership without manual operations. Start the first node in standalone mode, then start the remaining nodes pointing at it:
$ cockatrice server --bind-addr=127.0.0.1:7070 --index-dir=/tmp/cockatrice/node1/index --http-port=8080
$ cockatrice server --bind-addr=127.0.0.1:7071 --seed-addr=127.0.0.1:7070 --index-dir=/tmp/cockatrice/node2/index --http-port=8081
$ cockatrice server --bind-addr=127.0.0.1:7072 --seed-addr=127.0.0.1:7070 --index-dir=/tmp/cockatrice/node3/index --http-port=8082
Just add the --seed-addr parameter and start the node. The result is the same as creating a cluster with dynamic membership by manual operation: the command registers the new node and starts it at the same time.
Monitoring Cockatrice¶
The /-/_metrics
endpoint provides access to all the metrics. Cockatrice outputs metrics in Prometheus exposition format.
Get metrics¶
Once Cockatrice has started, you can get metrics with the following command:
$ curl -s -X GET http://localhost:8080/-/_metrics
You can see the result in Prometheus exposition format. The result of the above command is:
# HELP cockatrice_http_requests_total The number of requests.
# TYPE cockatrice_http_requests_total counter
cockatrice_http_requests_total{endpoint="/myindex",method="PUT",status_code="202"} 1.0
cockatrice_http_requests_total{endpoint="/myindex/_docs",method="PUT",status_code="202"} 1.0
# HELP cockatrice_http_requests_bytes_total A summary of the invocation requests bytes.
# TYPE cockatrice_http_requests_bytes_total counter
cockatrice_http_requests_bytes_total{endpoint="/myindex",method="PUT"} 7376.0
cockatrice_http_requests_bytes_total{endpoint="/myindex/_docs",method="PUT"} 3909.0
# HELP cockatrice_http_responses_bytes_total A summary of the invocation responses bytes.
# TYPE cockatrice_http_responses_bytes_total counter
cockatrice_http_responses_bytes_total{endpoint="/myindex",method="PUT"} 135.0
cockatrice_http_responses_bytes_total{endpoint="/myindex/_docs",method="PUT"} 137.0
# HELP cockatrice_http_requests_duration_seconds The invocation duration in seconds.
# TYPE cockatrice_http_requests_duration_seconds histogram
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="0.005",method="PUT"} 0.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="0.01",method="PUT"} 0.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="0.025",method="PUT"} 0.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="0.05",method="PUT"} 0.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="0.075",method="PUT"} 0.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="0.1",method="PUT"} 0.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="0.25",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="0.5",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="0.75",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="1.0",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="2.5",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="5.0",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="7.5",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="10.0",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex",le="+Inf",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_count{endpoint="/myindex",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_sum{endpoint="/myindex",method="PUT"} 0.22063422203063965
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="0.005",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="0.01",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="0.025",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="0.05",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="0.075",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="0.1",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="0.25",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="0.5",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="0.75",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="1.0",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="2.5",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="5.0",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="7.5",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="10.0",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_bucket{endpoint="/myindex/_docs",le="+Inf",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_count{endpoint="/myindex/_docs",method="PUT"} 1.0
cockatrice_http_requests_duration_seconds_sum{endpoint="/myindex/_docs",method="PUT"} 0.0020329952239990234
# HELP cockatrice_index_documents The number of documents.
# TYPE cockatrice_index_documents gauge
cockatrice_index_documents{index_name="myindex"} 5.0
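Sample lines like those above can be consumed without a client library. Here is a minimal stdlib sketch (our own helper, not part of Cockatrice or the official Prometheus client; it ignores # HELP/# TYPE lines and assumes no commas inside label values) that splits one sample line into metric name, labels, and value:

```python
# Parse one Prometheus exposition-format sample line.
import re

LINE = re.compile(r'^(\w+)\{([^}]*)\}\s+(\S+)$')

def parse_sample(line):
    """Return (metric_name, labels_dict, value) for a sample line."""
    name, labels, value = LINE.match(line).groups()
    label_dict = dict(
        (k, v.strip('"'))
        for k, v in (pair.split("=", 1) for pair in labels.split(","))
    )
    return name, label_dict, float(value)

sample = 'cockatrice_index_documents{index_name="myindex"} 5.0'
print(parse_sample(sample))
# -> ('cockatrice_index_documents', {'index_name': 'myindex'}, 5.0)
```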
Health check¶
Cockatrice provides health endpoints which return 200 if Cockatrice is live or ready to respond to queries.
Liveness probe¶
To get the current liveness status, run:
$ curl -s -X GET http://localhost:8080/-/_health/liveness | jq .
You can see the result in JSON format. The result of the above command is:
{
"liveness": true,
"status": {
"code": 200,
"description": "Request fulfilled, document follows",
"phrase": "OK"
},
"time": 2.288818359375e-05
}
Readiness probe¶
To get the current readiness status, run:
$ curl -s -X GET http://localhost:8080/-/_health/readiness | jq .
You can see the result in JSON format. The result of the above command is:
{
"readiness": true,
"status": {
"code": 200,
"description": "Request fulfilled, document follows",
"phrase": "OK"
},
"time": 0.0001971721649169922
}
RESTful API Reference¶
Index APIs¶
The Index API is used to manage individual indices.
Create Index API¶
The Create Index API is used to manually create an index in Cockatrice. The most basic usage is the following:
PUT /rest/<INDEX_NAME>?sync=<SYNC>
---
schema:
  id:
    field_type: id
    args:
      unique: true
      stored: true
...
<INDEX_NAME>
: The index name.
<SYNC>
: Specifies whether to execute the command synchronously or asynchronously. If True is specified, the command executes synchronously. The default is False, which executes asynchronously.
- Request Body: YAML formatted schema definition.
Get Index API¶
The Get Index API allows you to retrieve information about an index. The most basic usage is the following:
GET /rest/<INDEX_NAME>
<INDEX_NAME>
: The index name.
Delete Index API¶
The Delete Index API allows you to delete an existing index. The most basic usage is the following:
DELETE /rest/<INDEX_NAME>?sync=<SYNC>
<INDEX_NAME>
: The index name.
<SYNC>
: Specifies whether to execute the command synchronously or asynchronously. If True is specified, the command executes synchronously. The default is False, which executes asynchronously.
Document APIs¶
Get Document API¶
GET /rest/<INDEX_NAME>/_doc/<DOC_ID>
<INDEX_NAME>
: The index name.
<DOC_ID>
: The document ID to retrieve.
Index Document API¶
PUT /rest/<INDEX_NAME>/_doc/<DOC_ID>?sync=<SYNC>
{
"name": "Cockatrice",
...
}
<INDEX_NAME>
: The index name.
<DOC_ID>
: The document ID to index.
<SYNC>
: Specifies whether to execute the command synchronously or asynchronously. If True is specified, the command executes synchronously. The default is False, which executes asynchronously.
- Request Body: JSON formatted fields definition.
Delete Document API¶
DELETE /rest/<INDEX_NAME>/_doc/<DOC_ID>?sync=<SYNC>
<INDEX_NAME>
: The index name.
<DOC_ID>
: The document ID to delete.
<SYNC>
: Specifies whether to execute the command synchronously or asynchronously. If True is specified, the command executes synchronously. The default is False, which executes asynchronously.
Index Documents API¶
PUT /rest/<INDEX_NAME>/_docs?sync=<SYNC>
[
{
"id": "1",
"name": "Cockatrice"
},
{
"id": "2",
...
]
<INDEX_NAME>
: The index name.
<SYNC>
: Specifies whether to execute the command synchronously or asynchronously. If True is specified, the command executes synchronously. The default is False, which executes asynchronously.
- Request Body: JSON formatted documents definition.
Delete Documents API¶
DELETE /rest/<INDEX_NAME>/_docs?sync=<SYNC>
[
"1",
"2",
...
]
<INDEX_NAME>
: The index name.
<SYNC>
: Specifies whether to execute the command synchronously or asynchronously. If True is specified, the command executes synchronously. The default is False, which executes asynchronously.
- Request Body: JSON formatted document IDs definition.
Search APIs¶
Search API¶
GET /rest/<INDEX_NAME>/_search?query=<QUERY>&search_field=<SEARCH_FIELD>&page_num=<PAGE_NUM>&page_len=<PAGE_LEN>
<INDEX_NAME>
: The index name to search.
<QUERY>
: The unicode string to search the index for.
<SEARCH_FIELD>
: Uses this as the field for any terms without an explicit field.
<PAGE_NUM>
: The page number to retrieve, starting at 1 for the first page.
<PAGE_LEN>
: The number of results per page.
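As a convenience when scripting, the search parameters above can be assembled with the standard library. The search_url helper below is our own sketch, not part of Cockatrice; it targets the /_search path used in the earlier curl examples:

```python
# Build a paginated search URL with encoded query parameters.
from urllib.parse import urlencode

def search_url(base, index_name, query, search_field=None,
               page_num=1, page_len=10):
    params = {"query": query, "page_num": page_num, "page_len": page_len}
    if search_field:
        params["search_field"] = search_field
    return f"{base}/{index_name}/_search?" + urlencode(params)

print(search_url("http://localhost:8080", "myindex", "search engine"))
```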