TLS encryption is becoming the default everywhere, and monitoring the certificates being issued is important for several reasons:
- Detection of malicious certificates
- Monitoring of mistakes
One example of why Certificate Transparency matters is the incident in which Symantec generated certificates for a google.com domain, even though Google had never requested them.
More details on the event here.
Fortunately, Google caught those malicious certificates by using Certificate Transparency logs.
If you want more details on Certificate Transparency I strongly urge you to read more on their website.
The basic principle is that for every certificate generated by a Certificate Authority participating in Certificate Transparency (CTS), a reliable log entry is generated that can be inspected.
This allows domain owners to search for new certificate entries, monitor any events and detect any malicious activity.
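As a rough illustration of that kind of monitoring, the sketch below checks whether a log entry covers a domain you own. It assumes entries shaped like CertStream messages, where the names on a certificate are listed under data.leaf_cert.all_domains. The function name matching_domains and the watched domain example.com are hypothetical.

```python
def matching_domains(message, watched=("example.com",)):
    """Return the names on a certificate that fall under a watched domain."""
    # Assumption: CertStream-style layout, message["data"]["leaf_cert"]["all_domains"].
    names = message.get("data", {}).get("leaf_cert", {}).get("all_domains", [])
    hits = []
    for name in names:
        bare = name.lower()
        # Compare wildcard names such as "*.example.com" without the "*." label.
        if bare.startswith("*."):
            bare = bare[2:]
        for domain in watched:
            # Match the domain itself or any subdomain, but not look-alikes
            # such as "notexample.com".
            if bare == domain or bare.endswith("." + domain):
                hits.append(name)
                break
    return hits
```

Any non-empty result for a certificate you did not request would be worth a closer look.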
Indexing CTS into Hydrolix
There are several ways to get CTS entries. The easiest, in my opinion, is to use CertStream:
https://certstream.calidog.io/
CertStream runs a websocket server that you can use to fetch all the log entries. They provide libraries in Python, Go, JavaScript and Java to connect to the websocket and retrieve the logs.
All of those libraries are open source and available on GitHub: https://github.com/search?q=org%3ACaliDog+certstream
In this example I've used a Python script to connect to the websocket and build batches of 1000 entries to POST to Hydrolix:
```python
#!/usr/bin/env python3
import json
import logging

import certstream
import requests

# Hydrolix streaming ingest endpoint; replace the hostname with your cluster's.
URL = 'https://hostname.hydrolix.live/ingest/event'
HEADERS = {
    "X-HDX-Table": "sample.cts",
    "X-HDX-Transform": "transform_cts",
    "Content-Type": "application/json",
}
BATCH_SIZE = 1000

# Certificate updates waiting to be sent to Hydrolix.
data = []


def print_callback(message, context):
    global data
    # Ignore heartbeats and other non-certificate messages.
    if message['message_type'] == "certificate_update":
        data.append(message)
        if len(data) == BATCH_SIZE:
            # Send the whole batch as a single JSON array.
            r = requests.post(URL, headers=HEADERS, data=json.dumps(data))
            print(r.status_code)
            data = []


logging.basicConfig(format='[%(levelname)s:%(name)s] %(asctime)s - %(message)s', level=logging.INFO)
certstream.listen_for_events(print_callback, url='wss://certstream.calidog.io/')
```
In this example I'm sending a POST request to my Hydrolix cluster, into the table sample.cts, using the transform transform_cts.
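One thing to note about the script: it only posts when the buffer reaches exactly 1000 entries, so anything still buffered when it stops is never sent. A possible way to handle that is sketched below. The Batcher class and its send hook are hypothetical names, and the example uses a stand-in sender (a list) rather than a real call to Hydrolix.

```python
import json


class Batcher:
    """Collect entries and hand them to `send` in fixed-size batches."""

    def __init__(self, send, size=1000):
        self.send = send      # hypothetical hook: a callable taking a JSON string
        self.size = size
        self.entries = []

    def add(self, entry):
        self.entries.append(entry)
        if len(self.entries) >= self.size:
            self.flush()

    def flush(self):
        """Send whatever is buffered, even a partial batch."""
        if self.entries:
            batch, self.entries = self.entries, []
            self.send(json.dumps(batch))


# Stand-in sender for illustration; a real one would wrap requests.post.
sent = []
batcher = Batcher(sent.append, size=2)
for i in range(5):
    batcher.add({"n": i})
batcher.flush()  # push the final partial batch, for example on shutdown
```

Keeping the sender injectable also makes the batching logic easy to exercise without a cluster.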
You can find the Hydrolix transform example associated with Certificate Transparency logs in our GitLab:
https://gitlab.com/hydrolix/hdx-tools/-/blob/master/vscode/cts.http
You can read the blog post on how to use VSCode with Hydrolix to use this transform.
Viewing CTS data
Hydrolix provides several tools to view data. For this example we have used Superset.
The dashboard contains 2 separate tabs. The first shows the number of certificates generated over time, the percentage by CA certificate issuer, and the total number of certificates.

The second view allows you to see the details of the Certificate logs:

On the left is a filter that applies to the whole dashboard. It includes the time range, the Certificate Authority issuer, and a specific domain pattern; by default we look for *domain*.
If you would like to inspect this yourself, we have a demo available you can use:
https://superset.hydrolix.io/
We’ll keep indexing the certificate transparency logs so you can use the data all the time and search with no limit!