# Daily Hack #day79 - Indexing DynamoDB Items to Elasticsearch using AWS Lambda

Indexing DynamoDB items in Elasticsearch adds full-text search on top of a table that is otherwise queried by key. With DynamoDB Streams and AWS Lambda, each insert, update, or delete on the table triggers a function that applies the same change to the Elasticsearch index, so the index follows the table in near real time.

### Key Components:

1. **DynamoDB**: A fast and flexible NoSQL database service for any scale.
    
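2. **Elasticsearch**: A search engine that provides full-text search capabilities. On AWS, the managed offering (formerly Amazon Elasticsearch Service) is now called Amazon OpenSearch Service; its API actions and request signing still use the `es` service name.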
    
3. **AWS Lambda**: A serverless compute service that runs code in response to events.
    
4. **DynamoDB Streams**: An ordered log of item-level changes to a DynamoDB table. Lambda reads this log in batches and invokes your function with the change records.
    

### Steps to Set Up Indexing:

#### 1\. Enable DynamoDB Streams

First, enable DynamoDB Streams on your DynamoDB table. This will capture changes (insert, update, delete) to items in the table.

* Go to the DynamoDB console.
    
* Select the table you want to index.
    
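* Open the stream settings. Older versions of the console show a "Manage Stream" button on the table overview; newer versions place the setting under the "Exports and streams" tab.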
    
* Enable the stream and choose the "New and old images" option to capture both the old and new item images.
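
If you prefer to script this step, the same setting can be applied through the DynamoDB `UpdateTable` API. The sketch below is a minimal example using boto3; the helper names `stream_spec_params` and `enable_stream` are made up for illustration, and the table name is whatever you pass in.

```python
def stream_spec_params(table_name):
    """Build the UpdateTable arguments that turn on a stream with both images."""
    return {
        "TableName": table_name,
        "StreamSpecification": {
            "StreamEnabled": True,
            # Equivalent to the "New and old images" console option.
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
    }


def enable_stream(table_name):
    """Enable the stream and return its ARN (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the parameter builder works without boto3

    client = boto3.client("dynamodb")
    response = client.update_table(**stream_spec_params(table_name))
    return response["TableDescription"].get("LatestStreamArn")
```

Calling `enable_stream("your-table-name")` should return the stream ARN, which is needed later when the stream is connected to the Lambda function.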
    

#### 2\. Create an AWS Lambda Function

Next, create a Lambda function that will process the stream records and index them to Elasticsearch.

* Go to the AWS Lambda console.
    
* Create a new function.
    
* Choose a runtime (e.g., Python, Node.js).
    
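* Set up the Lambda function with the following code (this example uses Python). The `elasticsearch` and `requests-aws4auth` packages are not included in the Lambda Python runtime, so bundle them in the deployment package or a layer. The code uses `RequestsHttpConnection`, which is part of the 7.x `elasticsearch` client.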
    

```python
import json
import boto3
from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

# Initialize the Elasticsearch client once, outside the handler,
# so warm invocations reuse the same connection.
region = 'your-region' # e.g., 'us-west-1'
service = 'es'
index_name = 'your-index-name'
credentials = boto3.Session().get_credentials()
aws_auth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

es = Elasticsearch(
    hosts = [{'host': 'your-es-domain-endpoint', 'port': 443}],
    http_auth = aws_auth,
    use_ssl = True,
    verify_certs = True,
    connection_class = RequestsHttpConnection
)

def get_string(image, attribute):
    """Return a string attribute from a DynamoDB image, or None if it is missing."""
    return image.get(attribute, {}).get('S')

def lambda_handler(event, context):
    for record in event['Records']:
        # 'Keys' is present on every stream record, regardless of the stream view type.
        # This example assumes the table's partition key is a string attribute named 'id'.
        doc_id = record['dynamodb']['Keys']['id']['S']

        if record['eventName'] in ('INSERT', 'MODIFY'):
            # Index the new image; reusing the item key as the document ID
            # makes a MODIFY overwrite the earlier version of the document.
            new_image = record['dynamodb']['NewImage']
            document = {
                'id': doc_id,
                'name': get_string(new_image, 'name'),
                'description': get_string(new_image, 'description')
            }
            es.index(index=index_name, id=doc_id, body=document)

        elif record['eventName'] == 'REMOVE':
            # ignore=404 keeps a retried delete from failing the whole batch
            # when the document is already gone.
            es.delete(index=index_name, id=doc_id, ignore=404)

    return {
        'statusCode': 200,
        'body': json.dumps('Successfully processed records')
    }
```

Replace `'your-region'`, `'your-es-domain-endpoint'`, and `'your-index-name'` with your actual AWS region, Elasticsearch domain endpoint, and the index name you wish to use.
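
The handler above copies three string attributes by name. If your items carry other attributes or types, it can help to convert the whole DynamoDB image into plain Python values before indexing. The following is a rough sketch of such a conversion; `unmarshal` and `image_to_document` are hypothetical helper names, binary types are deliberately left out, and boto3's own `TypeDeserializer` (in `boto3.dynamodb.types`) is an alternative if you are fine with numbers arriving as `Decimal`.

```python
from decimal import Decimal


def unmarshal(value):
    """Convert one DynamoDB attribute value, e.g. {'S': 'abc'}, to plain Python."""
    (type_tag, data), = value.items()
    if type_tag == "S":
        return data
    if type_tag == "N":
        # DynamoDB sends numbers as strings; keep integers as int, the rest as float.
        number = Decimal(data)
        return int(number) if number == number.to_integral_value() else float(number)
    if type_tag == "BOOL":
        return data
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [unmarshal(item) for item in data]
    if type_tag == "M":
        return {key: unmarshal(item) for key, item in data.items()}
    if type_tag == "SS":
        return sorted(data)
    if type_tag == "NS":
        return sorted(unmarshal({"N": item}) for item in data)
    # Binary types (B, BS) are not handled in this sketch.
    raise ValueError(f"unsupported DynamoDB type: {type_tag}")


def image_to_document(image):
    """Convert a NewImage/OldImage mapping into a JSON-friendly document."""
    return {name: unmarshal(value) for name, value in image.items()}
```

With this in place, the `document` dictionary in the handler could be built as `image_to_document(new_image)` instead of listing attributes one by one.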

#### 3\. Add Permissions to Lambda

Ensure your Lambda function has the necessary permissions to access DynamoDB streams and Elasticsearch.

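* Attach an IAM role to your Lambda function with permissions for `dynamodb:DescribeStream`, `dynamodb:GetRecords`, `dynamodb:GetShardIterator`, `dynamodb:ListStreams`, `es:ESHttpPost`, `es:ESHttpPut`, and `es:ESHttpDelete`. The extra `es` actions are needed because indexing a document under an explicit ID is sent as an HTTP PUT and removing one as an HTTP DELETE. The role also needs the usual CloudWatch Logs permissions so the function can write logs, and the domain's access policy must allow requests from this role.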
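
As a rough illustration, the policy for that role could be generated like this. The function name `build_policy` and both ARNs are placeholders, not values from a real account, and `dynamodb:ListStreams` is granted on `*` here as a conservative assumption; adjust the resources to your own setup.

```python
import json


def build_policy(stream_arn, domain_arn):
    """Return an IAM policy document for the indexing Lambda (illustrative only)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Read change records from the table's stream.
                "Effect": "Allow",
                "Action": [
                    "dynamodb:DescribeStream",
                    "dynamodb:GetRecords",
                    "dynamodb:GetShardIterator",
                ],
                "Resource": stream_arn,
            },
            {   # Assumption: granted on "*" to stay on the safe side.
                "Effect": "Allow",
                "Action": "dynamodb:ListStreams",
                "Resource": "*",
            },
            {   # Index (PUT/POST) and delete (DELETE) documents in the domain.
                "Effect": "Allow",
                "Action": ["es:ESHttpPost", "es:ESHttpPut", "es:ESHttpDelete"],
                "Resource": f"{domain_arn}/*",
            },
            {   # Let the function write its logs to CloudWatch Logs.
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder ARNs; replace the region, account ID, and names with your own.
    policy = build_policy(
        "arn:aws:dynamodb:us-west-1:123456789012:table/your-table/stream/*",
        "arn:aws:es:us-west-1:123456789012:domain/your-domain",
    )
    print(json.dumps(policy, indent=2))
```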
    

#### 4\. Configure DynamoDB Stream as Event Source

Link the DynamoDB stream to your Lambda function.

* Go to the DynamoDB table details.
    
* Under the "Triggers" tab, add a new trigger.
    
* Select the Lambda function you created.
    
* Save the trigger.
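
The console trigger corresponds to a Lambda event source mapping, which can also be created with boto3. The sketch below assumes you already have the stream ARN and the function name; `event_source_mapping_params` and `connect_stream` are hypothetical helpers, and the retry and batch values are only starting points.

```python
def event_source_mapping_params(stream_arn, function_name, batch_size=100):
    """Build the arguments for Lambda's CreateEventSourceMapping call."""
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        # LATEST skips records already in the stream; TRIM_HORIZON replays them.
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,
        # Limit retries so one bad record does not block its shard indefinitely.
        "MaximumRetryAttempts": 3,
        # Split a failing batch in two to narrow down the offending record.
        "BisectBatchOnFunctionError": True,
    }


def connect_stream(stream_arn, function_name):
    """Create the mapping and return its UUID (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the parameter builder works without boto3

    client = boto3.client("lambda")
    response = client.create_event_source_mapping(
        **event_source_mapping_params(stream_arn, function_name)
    )
    return response["UUID"]
```

Choosing `LATEST` means only changes made after the mapping is created are indexed; items that already exist in the table need a separate one-off backfill.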
    

### Testing and Monitoring

* **Test Your Setup**: Insert, update, and delete items in your DynamoDB table and verify that the corresponding changes are reflected in Elasticsearch.
    
* **Monitor Logs**: Use Amazon CloudWatch to monitor your Lambda function logs for any errors or issues.
    
* **Performance Tuning**: Adjust the batch size and concurrency settings for your Lambda function based on the expected load and performance requirements.
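
Before testing against real AWS resources, the record handling can be exercised locally with hand-built stream records and a stand-in client. The sketch below is self-contained and only follows the same INSERT/MODIFY/REMOVE branching as the handler; `RecordingClient`, `process_records`, and `make_record` are hypothetical names, and the records assume a string key named `id`.

```python
class RecordingClient:
    """Stand-in for the Elasticsearch client that only records the calls made."""

    def __init__(self):
        self.calls = []

    def index(self, **kwargs):
        self.calls.append(("index", kwargs))

    def delete(self, **kwargs):
        self.calls.append(("delete", kwargs))


def make_record(event_name, item_id, name=None):
    """Build a minimal DynamoDB stream record for local testing."""
    record = {"eventName": event_name, "dynamodb": {"Keys": {"id": {"S": item_id}}}}
    if event_name in ("INSERT", "MODIFY"):
        record["dynamodb"]["NewImage"] = {"id": {"S": item_id}, "name": {"S": name}}
    return record


def process_records(records, client, index_name):
    """Apply each stream record to the search index through the given client."""
    for record in records:
        doc_id = record["dynamodb"]["Keys"]["id"]["S"]
        if record["eventName"] in ("INSERT", "MODIFY"):
            image = record["dynamodb"]["NewImage"]
            document = {"id": doc_id, "name": image["name"]["S"]}
            client.index(index=index_name, id=doc_id, body=document)
        elif record["eventName"] == "REMOVE":
            client.delete(index=index_name, id=doc_id)


if __name__ == "__main__":
    client = RecordingClient()
    events = [
        make_record("INSERT", "1", "first"),
        make_record("MODIFY", "1", "renamed"),
        make_record("REMOVE", "1"),
    ]
    process_records(events, client, "your-index-name")
    for operation, arguments in client.calls:
        print(operation, arguments)
```

Passing the client in as an argument, rather than using a module-level global, is what makes this kind of local check possible without a running Elasticsearch domain.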
    

### Benefits:

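* **Near-Real-Time Indexing**: Changes reach the Elasticsearch index shortly after they are written to DynamoDB, with no separate batch job. The index is eventually consistent with the table, so a search issued immediately after a write may not see it yet.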
    
* **Scalability**: Lambda processes the stream's shards in parallel, so throughput grows with the number of shards as write traffic on the table increases.
    
* **Serverless**: Minimizes infrastructure management by using AWS managed services.
    

By following these steps, you can set up a robust and scalable solution for indexing DynamoDB items to Elasticsearch using AWS Lambda, enhancing your application's search capabilities.
