NoCoDe's STORY: Getting Started with Elasticsearch (Elasticsearch Getting Started Webinar)

The Elastic Product Portfolio

[Image source: www.elastic.co]

0. Elasticsearch features

real-time

distributed

search

analytics

Table 1. Relational database terms and their Elasticsearch counterparts

Relational database	Elasticsearch
Database	Index
Table	Type
Row	Document
Column	Field
Schema	Mapping
Index	Everything is indexed
SQL	Query DSL

1. Installing Elasticsearch

1.1 Downloading and installing Java

1.1.1 Download from the website

http://www.oracle.com/technetwork/java/javase/downloads/index.html

1.1.2 Download from the Linux command line

For 32-bit Linux:

curl -v -j -k -L -H "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u51-b16/jdk-8u51-linux-i586.rpm > jdk-8u51-linux-i586.rpm

For 64-bit Linux:

curl -v -j -k -L -H "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u51-b16/jdk-8u51-linux-x64.rpm > jdk-8u51-linux-x64.rpm

Install the RPM (use the x64 file name on a 64-bit system):

rpm -ivh jdk-8u51-linux-i586.rpm

1.2 Java versions supported by Elasticsearch

Product and JVM

1.3 Downloading Elasticsearch

1.3.1 Download from the website

1.3.2 Download from the Linux command line

curl -L -O http://download.elasticsearch.org/PATH/TO/VERSION.zip

unzip elasticsearch-$VERSION.zip

cd elasticsearch-$VERSION

1.3.3 Elasticsearch directory layout

$ tree -L 2
├── LICENSE.txt
├── NOTICE.txt
├── README.textile
├── bin
│   ├── elasticsearch
│   ├── elasticsearch-service-mgr.exe
│   ├── elasticsearch-service-x64.exe
│   ├── elasticsearch-service-x86.exe
│   ├── elasticsearch.bat
│   ├── elasticsearch.in.bat
│   ├── elasticsearch.in.sh
│   ├── plugin
│   ├── plugin.bat
│   └── service.bat
├── config
│   ├── elasticsearch.yml
│   └── logging.yml
├── data
│   └── elasticsearch
├── lib
│   ├── antlr-runtime-3.5.jar
│   ├── apache-log4j-extras-1.2.17.jar
│   ├── asm-4.1.jar
│   ├── asm-commons-4.1.jar
│   ├── elasticsearch-1.6.0.jar
│   ├── groovy-all-2.4.0.jar
│   ├── jna-4.1.0.jar
│   ├── jts-1.13.jar
│   ├── log4j-1.2.17.jar
│   ├── lucene-analyzers-common-4.10.4.jar
│   ├── lucene-core-4.10.4.jar
│   ├── lucene-expressions-4.10.4.jar
│   ├── lucene-grouping-4.10.4.jar
│   ├── lucene-highlighter-4.10.4.jar
│   ├── lucene-join-4.10.4.jar
│   ├── lucene-memory-4.10.4.jar
│   ├── lucene-misc-4.10.4.jar
│   ├── lucene-queries-4.10.4.jar
│   ├── lucene-queryparser-4.10.4.jar
│   ├── lucene-sandbox-4.10.4.jar
│   ├── lucene-spatial-4.10.4.jar
│   ├── lucene-suggest-4.10.4.jar
│   ├── sigar
│   └── spatial4j-0.4.1.jar
├── logs
│   ├── elasticsearch.log
│   ├── elasticsearch_index_indexing_slowlog.log
│   └── elasticsearch_index_search_slowlog.log
└── plugins
    ├── bigdesk
    ├── head
    └── kopf

2. Basic configuration

elasticsearch.yml

1. Set cluster.name.

ex: cluster.name: elasticsearch_dev01

2. Set node.name.

ex: node.name: search_dev01

3. To keep the JVM from being swapped out, set the following value to true.

bootstrap.mlockall: true

4. Host setting

network.host: 127.0.0.1

5. HTTP port setting

http.port: 9200

6. Run Elasticsearch

~/bin/elasticsearch

curl -XGET localhost:9200/_cluster/health

curl -XGET 'localhost:9200/_cluster/health?pretty'

{
  "cluster_name" : "elasticsearch_dev01",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}
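
The health response above can also be checked from a script. A minimal Python sketch, assuming the response body has already been captured as a string; `summarize_health` is a hypothetical helper for illustration, not part of Elasticsearch:

```python
import json

# Sample body, abbreviated from the health output shown above.
HEALTH_JSON = """
{
  "cluster_name": "elasticsearch_dev01",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "unassigned_shards": 0
}
"""


def summarize_health(body):
    """Hypothetical helper: parse a _cluster/health body and flag problems."""
    health = json.loads(body)
    return {
        "cluster": health["cluster_name"],
        "status": health["status"],
        # Treat the cluster as healthy only when it is green and did not time out.
        "ok": health["status"] == "green" and not health["timed_out"],
        "nodes": health["number_of_nodes"],
    }


summary = summarize_health(HEALTH_JSON)
print(summary)
```

On a fresh single-node install with no indices, "green" is expected because there are no shards that could be unassigned yet.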

Installing the Head plugin (run from the bin directory)

./plugin -i mobz/elasticsearch-head

Installing the Marvel plugin

./plugin -i elasticsearch/marvel/latest

http://localhost:9200/_plugin/marvel/kibana/index.html

3. CRUD Operations

http://localhost:9200/_plugin/marvel/sense/index.html

#____________________________________________________

# Document CRUD operation

#____________________________________________________

# Index a JSON document

#    ---- Index name
#    |
#    |       ----- Type name
#    |       |
#    |       |     ---- Doc ID
#    |       |     |
#    V       V     V
PUT /library/books/1
{
  "title": "A fly on the wall",
  "name": {
    "first": "Drosophila",
    "last": "Melanogaster"
  },
  "publish_date": "2015-07-31T12:00:00-0400",
  "price": 19.95
}
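
The same index request can be assembled outside Sense. A small Python sketch using only the standard library; it builds the request without sending it. `BASE_URL` assumes the local node from section 2, and `build_index_request` is a hypothetical helper name:

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node configured in section 2


def build_index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request for /{index}/{type}/{id}."""
    url = "{}/{}/{}/{}".format(BASE_URL, index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


book = {
    "title": "A fly on the wall",
    "name": {"first": "Drosophila", "last": "Melanogaster"},
    "publish_date": "2015-07-31T12:00:00-0400",
    "price": 19.95,
}

request = build_index_request("library", "books", 1, book)
print(request.get_method(), request.full_url)
# prints: PUT http://localhost:9200/library/books/1
```

To actually index the document against a running node, pass the request to `urllib.request.urlopen(request)`.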

#____________________________________________________

# Retrieve the document by the ID used above

GET /library/books/1

#____________________________________________________

# Index a document without specifying a doc ID

POST /library/books/
{
  "title": "Adventures of Strange-Foot Smooth",
  "name": {
    "first": "Xenopus",
    "last": "laevis"
  },
  "publish_date": "2015-07-31T12:12:00-0400",
  "price": 5.99
}

#____________________________________________________

# To GET this document you need the ID that was auto-generated above.

GET /library/books/AU7iHhAcxZSlo5TsOhLA

#____________________________________________________

# Update the entire document

PUT /library/books/1
{
  "title": "A fly on the wall PART2",
  "name": {
    "first": "Drosophila",
    "last": "Melanogaster"
  },
  "publish_date": "2015-07-31T12:00:00-0400",
  "price": 29.95
}

GET /library/books/1

#____________________________________________________

# Update API (partial update)

POST /library/books/1/_update
{
  "doc": {
    "price": 10
  }
}

GET /library/books/1

#____________________________________________________

# Update API - adding a field

POST /library/books/1/_update
{
  "doc": {
    "cn_price": 13
  }
}

GET /library/books/1

#____________________________________________________

# Delete the document

DELETE /library/books/1

GET /library/books/1

# Reference URL

# https://www.elastic.co/guide/en/elasticsearch/guide/current/data-in-data-out.html

#______________________________________________

# Bulk indexing and Search

DELETE /library/books/

#______________________________________________

# When you need to index many documents, use the bulk API.

POST /library/books/_bulk
{ "index": {"_id": 1}}
{ "title": "The quick brown fox", "price": 5}
{ "index": {"_id": 2}}
{ "title": "The quick brown fox jumps over the lazy dog", "price": 15}
{ "index": {"_id": 3}}
{ "title": "The quick brown fox jumps over the quick dog", "price": 8 }
{ "index": {"_id": 4}}
{ "title": "Brown fox brown dog", "price": 2}
{ "index": {"_id": 5}}
{ "title": "Lazy dog", "price": 9}
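
The bulk body is newline-delimited JSON: one action line, then one source line, per document, and the body has to end with a newline. A short Python sketch that generates such a body; `build_bulk_body` is a hypothetical helper, and only two of the five documents are shown:

```python
import json


def build_bulk_body(docs):
    """Build a bulk request body from a mapping of doc ID -> document source."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                      # source line
    # The bulk API expects the final line to be newline-terminated as well.
    return "\n".join(lines) + "\n"


docs = {
    1: {"title": "The quick brown fox", "price": 5},
    5: {"title": "Lazy dog", "price": 9},
}

bulk_body = build_bulk_body(docs)
print(bulk_body, end="")
```

Each JSON object must stay on a single line, which is why the bulk request is not pretty-printed.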

# Reference URL

https://www.elastic.co/guide/en/elasticsearch/guide/current/bulk.html

4. Query & Search syntax

#____________________________________________________

# Running basic searches

# Search *all* documents

GET /library/books/_search

#____________________________________________________

# Find all documents containing "fox"

GET /library/books/_search
{
  "query": {
    "match": {
      "title": "fox"
    }
  }
}

#____________________________________________________

# How do we search for "quick" and "dog"?

GET /library/books/_search
{
  "query": {
    "match": {
      "title": "quick dog"
    }
  }
}

#____________________________________________________

# Phrase search within documents

GET /library/books/_search
{
  "query": {
    "match_phrase": {
      "title": "quick dog"
    }
  }
}

# Reference URL

https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-intro.html
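
The difference between the two queries above is easy to miss: `match` uses OR between terms by default, while `match_phrase` needs the terms next to each other and in order. A toy Python sketch over the five bulk-indexed titles; it only imitates the matching rule (no scoring, and whitespace splitting stands in for the real analyzer), and all function names are illustrative:

```python
TITLES = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the lazy dog",
    3: "The quick brown fox jumps over the quick dog",
    4: "Brown fox brown dog",
    5: "Lazy dog",
}


def tokens(text):
    """Very rough stand-in for analysis: lowercase and split on whitespace."""
    return text.lower().split()


def match(query, title):
    """Toy `match`: true if ANY query term occurs in the title (OR semantics)."""
    title_terms = set(tokens(title))
    return any(term in title_terms for term in tokens(query))


def match_phrase(query, title):
    """Toy `match_phrase`: the query terms must appear adjacent and in order."""
    q, t = tokens(query), tokens(title)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


match_hits = [doc_id for doc_id, title in TITLES.items() if match("quick dog", title)]
phrase_hits = [doc_id for doc_id, title in TITLES.items() if match_phrase("quick dog", title)]

print(match_hits)   # [1, 2, 3, 4, 5]
print(phrase_hits)  # [3]
```

So the plain `match` query returns every book, since each title contains "quick" or "dog", whereas the phrase query narrows the result to the one title containing "quick dog".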

#____________________________________________________

# Boolean combination search

# Query format

GET /library/books/_search
{
  "query": {
    "bool": {
      "must": [
        {}
      ],
      "must_not": [
        {}
      ],
      "should": [
        {}
      ]
    }
  }
}

# Find all documents containing "quick" and "lazy dog"

GET /library/books/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "quick"
          }
        },
        {
          "match_phrase": {
            "title": "lazy dog"
          }
        }
      ]
    }
  }
}
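
As a rough illustration of how `must` combines clauses, here is a toy Python sketch in which every clause has to match. It is not how Elasticsearch evaluates queries internally; the helper names are made up and whitespace splitting stands in for analysis:

```python
TITLES = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the lazy dog",
    3: "The quick brown fox jumps over the quick dog",
    4: "Brown fox brown dog",
    5: "Lazy dog",
}


def tokens(text):
    return text.lower().split()


def match(query, title):
    """Toy `match`: any query term present in the title."""
    return any(term in set(tokens(title)) for term in tokens(query))


def match_phrase(query, title):
    """Toy `match_phrase`: query terms adjacent and in order."""
    q, t = tokens(query), tokens(title)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


def bool_must(clauses, title):
    """Toy `bool`/`must`: every clause has to match the title."""
    return all(clause(title) for clause in clauses)


must = [
    lambda title: match("quick", title),
    lambda title: match_phrase("lazy dog", title),
]

hits = [doc_id for doc_id, title in TITLES.items() if bool_must(must, title)]
print(hits)  # [2]
```

Only document 2 contains both the term "quick" and the phrase "lazy dog".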

# Use boost to weight clauses differently within a combined search.

# This can improve the quality of the search results.

GET /library/books/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "title": {
              "query": "quick dog",
              "boost": 0.5
            }
          }
        },
        {
          "match_phrase": {
            "title": {
              "query": "lazy dog"
            }
          }
        }
      ]
    }
  }
}
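
To get a feel for what the boost does, the toy Python sketch below scores each title as the sum of the boosts of the `should` clauses it matches. This is deliberately not Lucene's real scoring formula (which also weighs term frequency, rarity and field length); it only shows that a clause boosted to 0.5 contributes less than a clause left at the default boost of 1.0:

```python
TITLES = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the lazy dog",
    3: "The quick brown fox jumps over the quick dog",
    4: "Brown fox brown dog",
    5: "Lazy dog",
}

# (phrase, boost) pairs mirroring the two should clauses; 1.0 is the default boost.
SHOULD = [("quick dog", 0.5), ("lazy dog", 1.0)]


def tokens(text):
    return text.lower().split()


def match_phrase(query, title):
    """Toy `match_phrase`: query terms adjacent and in order."""
    q, t = tokens(query), tokens(title)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


def toy_score(title):
    """Simplified score: add up the boosts of the matching clauses."""
    return sum(boost for phrase, boost in SHOULD if match_phrase(phrase, title))


scores = {doc_id: toy_score(title) for doc_id, title in TITLES.items()}
# With only should clauses, a document needs at least one match to be a hit.
ranked = sorted((d for d, s in scores.items() if s > 0), key=lambda d: -scores[d])
print(ranked)  # [2, 5, 3]
```

In this simplified model the "lazy dog" documents outrank the "quick dog" document; real relevance scores will differ in value, and may order documents 2 and 5 differently.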

#____________________________________________________

# Highlighting search keywords

GET /library/books/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "title": {
              "query": "quick dog",
              "boost": 0.5
            }
          }
        },
        {
          "match_phrase": {
            "title": {
              "query": "lazy dog"
            }
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}

#____________________________________________________

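# Filtered search

# Filters are generally faster than queries: they skip relevance scoring and their results can be cached.

# Filtered search - find all books priced at $5 or more

GET /library/books/_search
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "price": {
            "gte": 5
          }
        }
      }
    }
  }
}

# Filter - find all books that match the keyword and are priced at $5 or more

GET /library/books/_search
{
  "query": {
    "filtered": {
      "query": {
        "match": {
          "title": "lazy dog"
        }
      },
      "filter": {
        "range": {
          "price": {
            "gte": 5
          }
        }
      }
    }
  }
}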
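
The two filtered searches above can be mimicked on the bulk-indexed sample data. A toy Python sketch, with made-up helper names, in which the filter is a plain yes/no test on `price` and the query part is the OR-style `match` on the title:

```python
BOOKS = {
    1: {"title": "The quick brown fox", "price": 5},
    2: {"title": "The quick brown fox jumps over the lazy dog", "price": 15},
    3: {"title": "The quick brown fox jumps over the quick dog", "price": 8},
    4: {"title": "Brown fox brown dog", "price": 2},
    5: {"title": "Lazy dog", "price": 9},
}


def price_gte(book, minimum):
    """Toy range filter: a yes/no test, no scoring involved."""
    return book["price"] >= minimum


def match(query, title):
    """Toy `match`: any query term present in the title (OR semantics)."""
    title_terms = set(title.lower().split())
    return any(term in title_terms for term in query.lower().split())


filter_only = [i for i, b in BOOKS.items() if price_gte(b, 5)]
query_and_filter = [
    i for i, b in BOOKS.items()
    if price_gte(b, 5) and match("lazy dog", b["title"])
]

print(filter_only)       # [1, 2, 3, 5]
print(query_and_filter)  # [2, 3, 5]
```

Document 4 matches the keyword "dog" but is dropped by the price filter, and document 1 passes the filter but matches neither keyword.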

# Reference URL

https://www.elastic.co/guide/en/elasticsearch/guide/current/structured-search.html

5. Understanding Analysis

# Analysis is tokenization + token filters.

#____________________________________________________

# Tokenizers and filters

GET /library/_analyze?tokenizer=standard
{
  "Brown fox brown dog"
}

GET /library/_analyze?tokenizer=standard&filters=lowercase
{
  "Brown fox brown dog"
}

# Keep only the unique tokens from the tokenizer and filter output

GET /library/_analyze?tokenizer=letter&filters=unique,truncate
{
  "Brown brown brown fox brown dog"
}

#____________________________________________________

# A tokenizer + 0 or more token filters == analyzer

GET /library/_analyze?tokenizer=standard
{
  "Brown fox brown dog"
}

#____________________________________________________

# Understanding analysis is very important,

# because tokenizers can be applied in many different ways.

GET /library/_analyze?tokenizer=standard&filters=lowercase,kstem
{
  "THE quick.brown_FOx Jumped! $19.95 @ 3.0"
}

GET /library/_analyze?tokenizer=letter&filters=lowercase
{
  "THE quick.brown_FOx Jumped! $19.95 @ 3.0"
}
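
The "tokenizer + token filters" idea can be sketched in a few lines of Python. The functions below only approximate the `letter` tokenizer and the `lowercase` and `unique` token filters; they are illustrative stand-ins, not the Lucene implementations, and the `standard` tokenizer and `kstem` filter are not modelled at all:

```python
import re


def letter_tokenizer(text):
    """Approximates the `letter` tokenizer: split on anything that is not a letter."""
    return re.findall(r"[^\W\d_]+", text)


def lowercase(tokens):
    """Approximates the `lowercase` token filter."""
    return [token.lower() for token in tokens]


def unique(tokens):
    """Approximates the `unique` token filter: drop repeats, keep first occurrences."""
    seen = set()
    result = []
    for token in tokens:
        if token not in seen:
            seen.add(token)
            result.append(token)
    return result


def analyze(text, tokenizer, filters=()):
    """An analyzer is a tokenizer followed by zero or more token filters, in order."""
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


print(analyze("THE quick.brown_FOx Jumped! $19.95 @ 3.0", letter_tokenizer, [lowercase]))
# ['the', 'quick', 'brown', 'fox', 'jumped']
print(analyze("Brown brown brown fox brown dog", letter_tokenizer, [unique]))
# ['Brown', 'brown', 'fox', 'dog']
```

Filter order matters: running `lowercase` before `unique` collapses "Brown" and "brown" into one token, while `unique` on its own keeps both.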

# Reference URL

https://www.elastic.co/guide/en/elasticsearch/guide/current/_controlling_analysis.html

6. Mappings overview (schema definition)

GET /library/_mapping

#____________________________________________________

# Add a new field

PUT /library/books/_mapping
{
  "books": {
    "properties": {
      "my_new_field": {
        "type": "string"
      }
    }
  }
}

GET /library/_mapping

#____________________________________________________

# analyzers, etc

PUT /library/books/_mapping
{
  "books": {
    "properties": {
      "english_field": {
        "type": "string",
        "analyzer": "english"
      }
    }
  }
}

GET /library/_mapping

#____________________________________________________

# Trying to change the type or analyzer of an existing field raises an error!

PUT /library/books/_mapping
{
  "books": {
    "properties": {
      "english_field": {
        "type": "long",
        "analyzer": "cjk"
      }
    }
  }
}

# Workaround - https://gist.github.com/nicolashery/6317643

#____________________________________________________

# When double data is stored in a field mapped as long:

# how to resolve the problem

POST /logs/transactions/
{
  "id": 234571
}

POST /logs/transactions/
{
  "id": 1391.223
}

GET /logs/transactions/_search
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "id": {
            "gt": 1391,
            "lt": 1392
          }
        }
      }
    }
  }
}

GET /logs/_mapping

GET /logs/_search
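
The surprise in the requests above can be reproduced without a cluster. A rough Python sketch, assuming (as I understand dynamic mapping in Elasticsearch 1.x) that the first value seen fixes the field type and that later values are coerced to it when indexed; `infer_type` and `index_values` are hypothetical helpers, not Elasticsearch APIs:

```python
def infer_type(value):
    """Assumed dynamic-mapping rule: whole numbers map to long, others to double."""
    return "long" if isinstance(value, int) else "double"


def index_values(values):
    """The first value decides the mapping; later values are coerced to that type."""
    field_type = infer_type(values[0])
    cast = int if field_type == "long" else float
    return field_type, [cast(value) for value in values]


# Same order as the two POST requests: a whole number first, then a decimal.
field_type, indexed = index_values([234571, 1391.223])
hits = [value for value in indexed if 1391 < value < 1392]

print(field_type)  # long
print(indexed)     # [234571, 1391]
print(hits)        # []
```

Under that assumption the indexed value is 1391, which lies outside the open range (1391, 1392), so the range filter finds nothing even though the stored source still shows 1391.223. Checking `GET /logs/_mapping` shows which type the field actually received; defining the field as double before indexing avoids the problem.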

# Troubleshooting reference URL

# https://www.elastic.co/guide/en/elasticsearch/guide/current/analysis-intro.html


Monday, August 17, 2015
