How to Add Search to Your Static Site Generator (Jekyll, Hugo, Gatsby, Nikola, etc.)
I do love static site generators. Before Jekyll, I used WordPress, and if I forgot to babysit the platform, theme, and plugin updates for a few weeks or months, it was almost guaranteed to get hacked at some point. Once I moved to Jekyll, I never had that problem.
But there are also some downsides to static sites. For me, the biggest ones have been search and contact forms, along with the fact that I don’t want to pay for a third-party service. Comments used to be on this list, but to tell you the truth, I don’t miss moderating all the spam.
I have tried to implement search a few times and had no luck with Simple Jekyll Search. I also knew that Google didn’t index every page on my site, so using a custom search wasn’t worth it. So for years, I just didn’t have search. Whatever, it wasn’t one of my top priorities.
But last week I found a simple solution that searches every part of my site without a third-party service, and it was easy to set up. While I set up this search for Jekyll, it should work for other static site generators. The only part that will have to change is the template that builds the JSON in the search.md file.
What is Lunr.js?
It is what made this possible. The developers call it “A bit like Solr, but much smaller and not as bright.” Lunr.js is a lightweight JavaScript library that builds a full-text search index from plain JavaScript objects, such as data loaded from JSON, and searches it without needing a server. It runs in the browser and in Node.js.
The key features we will be using for searching static sites like Jekyll are:
- The ability to do full text search in JSON
- The ability to search in the browser
Lunr.js can do more things, but I didn’t really dig into them. The fact I could do full text search in the browser without a service was enough for me.
Creating the Static Site Search Results Page
This is the page that will generate the search results for you. After I saw how it worked, I kicked myself for not doing this in the first place. It essentially has a template in it that turns the content of the whole site into JSON. To tell you the truth, I thought this could be possible, but I wasn’t quite sure how huge the file could get.
And then I thought of modern frontend apps. I didn’t even have to go there, because I really overestimated how much content I have on my blog. The file generated by this template, which contains every word published on my site, is only 1.4MB. Yes, that is big, but considering I use very little other JavaScript on my blog, it is nowhere near the size of some sites. ChatGPT, where all of the magic happens on the server, uses 7MB of JavaScript.
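If you are unsure how large your own search data will turn out, a rough check is to measure the serialized object. The helper below is a hypothetical sketch (the name `storeSizeInBytes` is mine and not part of Lunr.js) that you could paste into the browser console on the search page and call with `window.store`. The number will differ slightly from the size of the generated page, so treat it as an estimate.

```javascript
// Hypothetical helper: estimate how many bytes the search data takes
// when serialized as UTF-8 JSON. Not part of Lunr.js.
function storeSizeInBytes(store) {
  var json = JSON.stringify(store);
  // TextEncoder always encodes to UTF-8 and is available in modern
  // browsers and in Node.js.
  return new TextEncoder().encode(json).length;
}

// Example with made-up data; on a real page you would pass window.store.
var sampleStore = {
  'hello-world': { title: 'Hello World', content: 'First post.' }
};
console.log(storeSizeInBytes(sampleStore) + ' bytes');
```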
This file is the only part of this setup that will have to be modified for static site generators other than Jekyll, because of differences in templating languages. I have also included a version that should work with Hugo. All you really have to do is use the templating engine in your static site generator to generate a JavaScript object named window.store with the title, author, category, content, and url of each post.
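To make the target shape concrete, here is a made-up example of what the generated object could look like, plus a small check you could run while porting the template to another generator. The post data and the helper `findIncompleteEntries` are hypothetical and only there for illustration.

```javascript
// Made-up example of the object the template should produce.
// Keys can be anything unique per post; the Jekyll template in this
// post uses a slugified URL.
var exampleStore = {
  '2024-01-01-hello-world': {
    title: 'Hello World',
    author: 'Jane Doe',
    category: 'jekyll',
    content: 'Plain text of the post, with HTML stripped.',
    url: '/2024/01/01/hello-world/'
  }
};

// Hypothetical helper: list entries that lack one of the expected fields.
var REQUIRED_FIELDS = ['title', 'author', 'category', 'content', 'url'];

function findIncompleteEntries(store) {
  var problems = [];
  Object.keys(store).forEach(function (key) {
    var missing = REQUIRED_FIELDS.filter(function (field) {
      return !(field in store[key]);
    });
    if (missing.length) {
      problems.push({ key: key, missing: missing });
    }
  });
  return problems;
}

console.log(findIncompleteEntries(exampleStore)); // prints []
```

Running `findIncompleteEntries(window.store)` in the browser console is a quick way to spot a template that skips a field.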
Jekyll Lunr.js Search Results Page
Here is the search.md page for Jekyll. Remember to replace the layout name with a layout from your template:
---
layout: page
title: Search Results
---
<!-- List where search results will be rendered -->
<ul id="search-results"></ul>
<script>
// Template to generate the JSON to search
window.store = {
{% for post in site.posts %}
"{{ post.url | slugify }}": {
"title": "{{ post.title | xml_escape }}",
"author": "{{ post.author | xml_escape }}",
"category": "{{ post.category | xml_escape }}",
"content": {{ post.content | strip_html | strip_newlines | jsonify }},
"url": "{{ post.url | xml_escape }}"
}
{% unless forloop.last %},{% endunless %}
{% endfor %}
};
</script>
<!-- Import lunr.js from unpkg.com -->
<script src="https://unpkg.com/lunr/lunr.js"></script>
<!-- Custom search script which we will create below -->
<script src="/js/search.js"></script>
Hugo Lunr.js Search Results Page
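I don’t have a Hugo site, so I have not tested this, but it should work as a starting point for a search results page in Hugo. One caveat: Hugo does not run template code inside content files, so the script portion will most likely need to live in a layout template rather than in the page itself: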
---
title: Search Results
---
<ul id="search-results"></ul>
<script>
window.store = {
{{ range where .Site.Pages "Section" "blog" }}
"{{ .Permalink }}": {
"title": "{{ .Title }}",
"author": [{{ range .Params.authors }}"{{ . }}",{{ end }}],
"category": [{{ range .Params.categories }}"{{ . }}",{{ end }}],
"content": {{ .Content | plainify }},
"url": "{{ .Permalink }}"
},
{{ end }}
}
</script>
<!-- Import lunr.js from unpkg.com -->
<script src="https://unpkg.com/lunr/lunr.js"></script>
<!-- Custom search script which we will create below -->
<script src="/js/search.js"></script>
Creating the Static Site Search Element
This is the search form. There is nothing special about it. It just sends the query to the /search/ URL as a GET parameter. In my Jekyll installation, I created a search-box.html file in the _includes folder and then included it in the header template of my blog with {% include search-box.html %}. You can do something similar with whatever static site generator you use, or simply add it directly to your main layout.
<div class="header-search">
<form class="header-search-form" action="/search/" method="get">
<input type="text" id="search-box" name="query">
<input type="submit" value="search">
</form>
</div>
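Since the form uses method="get", the text typed into the input named query ends up in the URL of the results page. The sketch below shows that round trip with the standard URLSearchParams API. The function names are my own, and this is an alternative way to read the parameter, not the approach the script in this post uses (it parses the query string by hand).

```javascript
// Sketch of what the browser does when the form is submitted with
// method="get": the input named "query" becomes a URL parameter.
function buildSearchUrl(term) {
  // URLSearchParams encodes spaces as "+", like a form submission does.
  return '/search/?' + new URLSearchParams({ query: term }).toString();
}

// Reading the value back on the results page. URLSearchParams decodes
// "+" and percent-escapes, and returns null when the parameter is absent.
function readSearchTerm(search) {
  return new URLSearchParams(search).get('query');
}

var url = buildSearchUrl('static site');
console.log(url); // /search/?query=static+site
console.log(readSearchTerm(url.split('?')[1])); // static site
```

On the real page you would call `readSearchTerm(window.location.search)`.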
Creating the Static Site Search Script
This script is what is included in the search.md file with the <script src="/js/search.js"></script> tag. It uses Lunr.js to search the JSON we generated in that page from the content of the site.
(function() {
function showResults(results, store) {
var searchResults = document.getElementById('search-results');
if (results.length) { // If there are results...
var appendString = '';
for (var i = 0; i < results.length; i++) { // Iterate over them and generate html
var item = store[results[i].ref];
appendString += '<li><a href="' + item.url + '"><h3>' + item.title + '</h3></a>';
appendString += '<p>' + item.content.substring(0, 250) + '...</p></li>';
}
searchResults.innerHTML = appendString;
} else {
searchResults.innerHTML = '<li>No results found</li>';
}
}
function getQuery(variable) {
var query = window.location.search.substring(1);
var vars = query.split('&');
for (var i = 0; i < vars.length; i++) {
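var pair = vars[i].split('=');
// Only use parameters that actually have a value, so "?query" alone does not throw
if (pair[0] === variable && pair.length > 1) {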
return decodeURIComponent(pair[1].replace(/\+/g, '%20'));
}
}
}
var searchTerm = getQuery('query');
if (searchTerm) {
document.getElementById('search-box').setAttribute("value", searchTerm);
// Initialize lunr.js with the fields to search.
// The title field is given more weight with the "boost" parameter
var idx = lunr(function () {
this.field('id');
this.field('title', { boost: 10 });
this.field('author');
this.field('category');
this.field('content');
for (var key in window.store) { // Add the JSON we generated from the site content to Lunr.js.
this.add({
'id': key,
'title': window.store[key].title,
'author': window.store[key].author,
'category': window.store[key].category,
'content': window.store[key].content
});
}
});
var results = idx.search(searchTerm); // Perform search with Lunr.js
showResults(results, window.store);
}
})();
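One small thing you may want to tweak in showResults is the preview text, which is always cut at exactly 250 characters and always gets an ellipsis. Below is a hypothetical helper (my own, not part of Lunr.js) that cuts at a word boundary instead and only adds the ellipsis when the text was actually shortened.

```javascript
// Hypothetical replacement for item.content.substring(0, 250) + '...'.
// Cuts at the last space before the limit and only appends an ellipsis
// when something was actually removed.
function excerpt(text, maxLength) {
  if (text.length <= maxLength) {
    return text;
  }
  var cut = text.substring(0, maxLength);
  var lastSpace = cut.lastIndexOf(' ');
  if (lastSpace > 0) {
    cut = cut.substring(0, lastSpace);
  }
  return cut + '...';
}

console.log(excerpt('Short post.', 250));            // Short post.
console.log(excerpt('one two three four five', 12)); // one two...
```

In search.js this would be used as `excerpt(item.content, 250)` when building the result markup.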
Conclusion
That’s really about it. I use pretty basic HTML to generate the search results in search.js. You will probably want to adjust that to your styles. Also, the search form uses classes pulled directly from my blog, and you might want to change those. Here is a search for other posts on my blog that mention Jekyll to show you how far you can take it.
Like I said, I haven’t dug too deep into what Lunr.js can do. Here are the Lunr.js docs if you want to customize your search further. You can also add more fields to search, for example tags if your blog uses them, or remove some. You will just have to modify the JavaScript object generated in the template and change the fields you initialized Lunr.js with.
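As a rough sketch of what changing the fields could look like, the snippet below keeps the field list in one place and derives the documents for Lunr.js from it. It assumes your template also outputs a tags value for each post, which the templates in this post do not do, and the names `SEARCH_FIELDS` and `toDocuments` are hypothetical.

```javascript
// Assumption: each entry in the store also has a "tags" value. That field
// is hypothetical; the templates in this post do not generate it.
var SEARCH_FIELDS = ['title', 'author', 'category', 'tags', 'content'];

// Turn the store into the flat documents that would be passed to
// Lunr's this.add(), copying only the configured fields.
function toDocuments(store, fields) {
  return Object.keys(store).map(function (key) {
    var doc = { id: key };
    fields.forEach(function (field) {
      // Missing values become empty strings so every document has the same shape.
      doc[field] = store[key][field] || '';
    });
    return doc;
  });
}

var docs = toDocuments(
  { 'my-post': { title: 'My Post', tags: 'jekyll search', content: 'Body text' } },
  SEARCH_FIELDS
);
console.log(docs[0]);
```

Inside the lunr(function () { ... }) setup you would then call this.field() once for each name in SEARCH_FIELDS and this.add() once for each document, so adding or removing a field means editing a single array plus the template.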
One improvement I am looking into is pre-building the Lunr index. So far I have found some details on it, along with some deprecated Jekyll plugins. Pre-building would speed up the load time of the search page, since the code above regenerates the index on each and every search, even though the content has not changed.
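I have not set this up yet, so take the following as a sketch of the general idea and not a tested solution. Lunr.js indexes can be serialized with JSON.stringify and restored with lunr.Index.load, so the index could be built once when the site is generated and only loaded in the browser. The two function names are hypothetical, the lunr library is passed in as a parameter so the sketch does not assume how you load it, and how the serialized index gets from the build step into the page is left open.

```javascript
// Build step (would run once, for example in Node.js when the site is
// generated). lunrLib is the lunr function itself.
function buildSerializedIndex(lunrLib, store) {
  var idx = lunrLib(function () {
    this.field('title', { boost: 10 });
    this.field('author');
    this.field('category');
    this.field('content');
    for (var key in store) {
      this.add({
        id: key,
        title: store[key].title,
        author: store[key].author,
        category: store[key].category,
        content: store[key].content
      });
    }
  });
  // Lunr indexes implement toJSON(), so JSON.stringify can serialize them.
  return JSON.stringify(idx);
}

// Page load (in the browser): restore the index instead of rebuilding it.
function loadSerializedIndex(lunrLib, serialized) {
  return lunrLib.Index.load(JSON.parse(serialized));
}
```

The restored index is searched the same way as before, with idx.search(searchTerm), and window.store is still needed to look up the title, url, and content of each result.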
But it works for me as is and took only about an hour to put into place and deploy. With the code already written for you, it should take about five minutes.