Technical SEO for a platform of 5 million users

If a user comes to your website to look at nude pictures but the first thing he sees is a popup article about recycling garbage – he is more likely to Google an alternative than to scroll down… from what I hear.

This is just one of the many stats (told in a different manner) that Google gives you for free. It gives them to you so that you can implement a better UX, which brings longer user retention, which ranks your website better, which directly improves your SEO scores.



Aca Brkić is a frontend developer who dug too greedily into the mines of Adobe software, SEO and PHP frameworks, thus releasing the bane of worrying too much about whether an image is wrongly named or saved in a different format. Enjoys proving that every browser-based online game can be beaten via the console and by inspecting HTML and JS variables. Sometimes likes to work from 2am till the morning.

– Frontend Developer – eJobs team


The platform on which my colleagues and I are working does not offer nudity, but offers something equally wanted by you and those around you – job opportunities and employers. More specifically, it offers jobs and candidates in Romania, an EU country. If you want to check it out you can at (or from app stores, we use capacitor.js to convert it to native). Besides having 5 million registered users, our little app has about 700 thousand page views per day (that is ~3 million unique visits per month) with an average stay of about 5 minutes. That is a lot. Before we dive into tech details, a little bit of philosophy. In order for someone to stay longer than a few seconds, the website must meet some basic requirements (next to the content itself). In reality they are quite generic and valid for all niches of business:

  • The webpage must load quickly. It has become a human reflex to grab the tab and close it if the page is ‘white’ with that rotating spinner thingy, because the user gets bored and is not interested in wasting time waiting for it to finish loading.
  • The meaningful content must be the first thing you see when the page loads, or the page must clearly lead you to it. If you are looking for a car, you expect an ad with a price and pictures. If you are looking for a programming solution, you expect to see the question and answer.
  • The website must have an appropriate and responsive design. Zoomed out gray tables with an 8px font require the user to do additional steps – pinching in, pinching out, moving left/right. Who has time for that kind of input in the time needed to figure out if you are in the right place? And no less important – older design trends also suggest older content, and users today want the latest intel.
  • Programming errors must not be noticeable. Insecure certificates (that red lock in the browser’s URL bar), antivirus prompts in another window etc. all convince the user to leave that website ASAP.

Even if you have all those bullet points checked, there are still users who close the tab after a few seconds. With an average stay of about 5 minutes, that in our case means there are also those who stay for 10+ minutes. So, the task is hard.


Five years ago I started working at a pharmaceutical company, in the digital sector. The job was developing websites/apps for every product they made. Cough syrups, weight loss supplements, erectile helpers etc. Competition in that area is huge! The experience gained there, while working on (and succeeding at) getting our sites to #1 of the search results for words such as ‘cough’, ‘vitamins’, ‘slime’ and the like, was priceless. Turns out, there is a whole lot more to do on the frontend side than just styling content, as you would imagine. Inspecting competitors’ websites, in particular those that are well ranked, showed us that simple things such as the difference between using <b> or <strong> elements (which are visually the same) or having keywords in the ‘right’ place of the URL impact the final results in a way more meaningful than the eye can see. In other words – you can have a high ranking (top place) website even without the always mentioned backlinks and other non-tech offpage stuff everyone is blabbering about. A lot has changed in these 5 years regarding search algorithms, but using knowledge gained from these examples boosted our scores sky high.

In order for someone to stumble upon us or be referred to us, there are a few other boxes to be checked. One of those is the domain name – if it is closely related to the content you are offering, that is a big plus. This check is not mandatory, but it offers a big push ahead of the competition. To put it in layman’s terms, bots think like this: if you are searching for ‘jobs’, there is a big chance you will find them at ‘ejobs’ as the first hint.

Of course, the biggest (non)technical item to have is content. Prompt and rich content. The most banal example of this in practice: if you search for a phrase that is part of a recent news article, chances are the top hits will lead you to news portals that published that article today. Basically, Google knows those portals are prompt (modern, up to date), and that users come there to find something that is currently happening. A similar thing happens with jobs. Upload and expiry dates make both users and bots follow that content more frequently and with more focus. Especially when there is a lot of it.


In come the bots


So far we spoke about ‘regular’ users, but there is another focus group that we are constantly trying to impress: bots. For search bots, the criteria are a little bit different. Besides speed and keywords in text, they also want to check semantics and code validity. They stare at the little things like alt attributes on images, sequentially descending headings (h1-h6 tags), proper elements and attributes for the given purpose (lists, aria labels, noopener for external links etc), meta data, and go as far as checking if the contrast between the foreground and background colors of text is good enough. You would be amazed at how much of this is ‘forgotten’ on the web. Heck, the more I write – the more examples of unfinished business on our site I come upon.
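One of those little things, the heading order, is easy to check by hand. The snippet below is only a minimal sketch: the function name findHeadingSkips is made up for this example, and it only looks for skipped levels (say, an h2 followed directly by an h4), not for every rule a bot might apply.

```javascript
// Hypothetical helper: takes heading levels in document order
// (1 for h1, 2 for h2, ...) and reports every place where a level
// is skipped on the way down, e.g. h2 followed directly by h4.
function findHeadingSkips(levels) {
  const skips = [];
  for (let i = 1; i < levels.length; i++) {
    const prev = levels[i - 1];
    const curr = levels[i];
    // Going deeper by more than one level skips a heading.
    // Going back up (h3 -> h2) is fine.
    if (curr - prev > 1) {
      skips.push({ index: i, from: prev, to: curr });
    }
  }
  return skips;
}

// In a browser, the levels could be collected like this (not run here):
// const levels = [...document.querySelectorAll('h1,h2,h3,h4,h5,h6')]
//   .map((heading) => Number(heading.tagName[1]));

console.log(findHeadingSkips([1, 2, 3, 2, 4]));
```

An empty array means no level was skipped. For the sample input the only reported skip is the last step, from level 2 straight to level 4.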



How do bots grade this? If we strictly speak of on-page tech – using web vitals. They are a combination of various metrics including loading speeds, different time points in interactivity, size, etc. Of course, as tech evolves so do the metrics, so the definitions sometimes change. And if I wasn’t clear, when I say bot – I mean Googlebot. The rest follow its practice.
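For a rough feel of how a single measurement gets graded, here is a minimal sketch. It assumes the thresholds Google has published for three of the metrics discussed further down (Largest Contentful Paint, First Input Delay, Cumulative Layout Shift). The function name rateVital is made up for this example, and since the definitions change over time the numbers should be treated as a snapshot.

```javascript
// Published thresholds for three web vitals (assumed current for this sketch).
// A value at or below "good" is green, above "poor" is red,
// anything in between is "needs improvement".
const THRESHOLDS = {
  LCP: { good: 2500, poor: 4000 }, // Largest Contentful Paint, milliseconds
  FID: { good: 100, poor: 300 },   // First Input Delay, milliseconds
  CLS: { good: 0.1, poor: 0.25 },  // Cumulative Layout Shift, unitless score
};

// Hypothetical helper: grade one measured value for one metric.
function rateVital(name, value) {
  const limits = THRESHOLDS[name];
  if (!limits) {
    throw new Error(`Unknown metric: ${name}`);
  }
  if (value <= limits.good) return 'good';
  if (value <= limits.poor) return 'needs improvement';
  return 'poor';
}

console.log(rateVital('LCP', 3100)); // between 2500 and 4000 ms
console.log(rateVital('CLS', 0.05)); // below 0.1
```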

We have a bunch of external scripts. Mountains. They are all there for analytics. In order for them to be ‘controlled’ by the marketing/analytics sector, most are implemented via Google Tag Manager, because it offers them a CMS-like experience (no coding required). Of course, an async attribute is put on every single one of them (which should load them without blocking user input and page load), but they still remain the biggest (if not the only) culprit for the bad aspects of the site performance. With a tear in my eye, I must admit we have not found a cure for this. No matter how late or async they load, no matter the way they load in, no matter the domain they are coming from – they will most certainly negatively affect the performance. Then again, if we rely on the privilege of knowing every detail about our audience (the details they agreed to share), then we accept the consequences of getting to that point.


How to know if your frontend is doing well


A rocky and tedious road has been taken to get to the point of only having external analytics as an unresolved little problem. It started with the previously mentioned small HTML things, went through headbanging on the refresh button on GTmetrix, PageSpeed Insights and other tools, and led all the way to major redesigns of the entire platform. Speaking of scores, they can be drastically different depending on the tool you are using. For instance, Chromium browsers have a built-in Lighthouse tool for testing vitals. It is great for pointing out items to check, but it uses a lab environment – which means much depends on the device you are using (for instance, browser extensions, cache memory and even the OS state disturb the score). Most of those items say ‘does not directly affect SEO’ but that should be considered a lie. The better the vitals are, the better the SEO is – and that is the only universal truth out there. Every time you fix a bigger item or a few smaller ones, the organic user number improves. But, even though we programmers love to see green 100s on the score sheet, the lab environment is not enough.


One step further from Lighthouse is PageSpeed Insights. By design, they are pretty much the same. The only difference is that this tool also runs the test from its own environment, kind of imitating the user experience. But even that tool does not give you a completely reliable reference. For instance, we use Cloudflare for one step of security and many testing tools cannot get past it (or cannot load some specific resources). What is definitely the most important score to look at is located in Google Search Console. Why? Because it takes into account the real user journey on the platform. That means using those same metrics, but looking at them on the actual users’ devices while they are navigating from page to page. So what was a slow load on the first page will be instantaneous on the next, drastically improving the scores.



If you look at our console results, you will see thousands of pages. But navigating through the platform you will find maybe fewer than 10. How? The answer is filters. Every filtered search (most importantly the popular ones) has its own URL. Each combination of filters has its own URL. This is a major SEO boost. The first thing a bot looks at (even before setting foot on the site) is the URL. If you have a keyword in it, you most likely have the advantage over the one that does not. To have ~35k valid URLs in this tool is a result worthy of glory. Changing small things as well as big logic components, then looking at the green scores afterwards, is a special feeling.
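To illustrate the idea of one URL per filter combination, here is a minimal sketch. The URL scheme, the filter names (city, department, level) and both function names are invented for this example and are not the platform’s actual routing. The point is only that the keywords end up in the path and that the same combination always produces the same URL.

```javascript
// Turn a human-readable filter value into a URL-safe slug:
// strip diacritics, lowercase, join words with hyphens.
function slugify(text) {
  return text
    .normalize('NFD')                 // split letters from their accent marks
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining accent marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // everything else becomes a hyphen
    .replace(/^-+|-+$/g, '');         // trim hyphens at both ends
}

// Hypothetical scheme: one path segment per active filter, always in the
// same order, so a filter combination maps to exactly one URL.
const FILTER_ORDER = ['city', 'department', 'level'];

function buildFilterPath(filters, base = '/jobs') {
  const segments = FILTER_ORDER
    .filter((key) => filters[key])
    .map((key) => slugify(filters[key]));
  return [base, ...segments].join('/');
}

console.log(buildFilterPath({ department: 'IT Software', city: 'București' }));
// → /jobs/bucuresti/it-software
```

Keeping the segment order fixed avoids two different URLs for the same search, which would otherwise count as duplicate content.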


Starting to see where the problems lie


Most of the crucial intel we got on what is important to optimize came during a bigger redesign of the site. There were hundreds of changes, but what really caught the eye was setting up gradient backgrounds on every page. Easy enough, right? Wrong. Gradients in vector drawing software are made through complex mathematical operations and as such the exported files have an enormous size. The subtle transitions from dark purple to a little less dark purple and over to orange are not something particularly observable to a developer. But they are to the designers, and they are ready to fight for them. Since they are not classical ellipses, it was not practical to use CSS radial gradients. The combination of 100 horizontal and vertical half visible rotated gradient ellipses would tire even a well oiled browser, so it is easier to export a raster image. It was at this moment he knew…



Remember those metrics? They really like to go all red on you if the resources (aka images) are huge in file size. Because when that is the case, Largest Contentful Paint takes longer to load and First Input Delay takes longer to fire. As the gradients are located at the top of each page, the LCP element is always the gradient one (the one you first see on page load, the one that is the largest on the visible part of the page), meaning it is of the utmost importance to have it optimized. Then we started researching page load speeds and order of resources, all because of the darn gradients.

About the size – a simple PNG export from Figma was about 10 MB. Free online optimization tools are not in big love with gradients, so even if they make a 100 KB WebP out of it, the quality loss is obvious and no one will approve it. The only thing to do was to bring back Illustrator knowledge from college and play around with it in combination with those optimization tools. The first thing we could remove was a bunch of shades of color, because the human eye does not need 8 pixels on top of each other with different opacities. There are ways of telling the vector software to merge them into 1 pixel of the final color. After days of playing around and combining those online optimizers, we got to an acceptable quality and a magnificent size of 16 KB! And if that wasn’t enough, we also cropped it for mobile use to a staggering 8 KB! In order not to load both at once (even though they are small, every server call is ‘expensive’ in terms of time), using media queries is enough to ensure that only the image matching the viewport is requested. This pretty much ‘greenlit’ the FID problem, but the LCP one remained. As it would turn out, it was the most crucial one of all, because it is connected to other parts of the loading flow. In most cases, the LCP element had some interactive part in it, like a search form. That means it is directly connected to other metrics related to user input, and to the time it takes for the element to become available to users. As the LCP does not happen until the gradient is loaded, we somehow needed to tell the browser that this image matters more to us. In comes – preload.



HTML preload links should make the browser download the resource even before it starts painting the webpage. If we go to the network panel, we will see just that – the first resource to be downloaded is the gradient image. All attributes in the screenshot above are needed in order for this to work properly, including the media query, for the same reasons as before. By using this, we improve not only our LCP score, but also others like FID and Time To Interactive. Considering this is a Single Page App, the download is required only once, so there is no delay on page navigation, and the next time the user loads the site the image will come from cache as it is a static resource. This is a perfect example of what tools cannot see, and Search Console can.
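As a rough sketch of what such preload links can look like, the snippet below describes one link per image variant, each with its own media query. The file names, the 767px breakpoint, the WebP type and the function names are all assumptions made for this example. The objects use the shape that Nuxt accepts in the link array of its head configuration, and toLinkTag only shows the HTML they stand for.

```javascript
// Assumed breakpoint between the mobile and the desktop gradient.
const MOBILE_MAX_WIDTH = 767;

// Hypothetical file names. The important part is the attribute set:
// rel="preload", as="image", and a media query per variant so the
// browser only preloads the image that matches the viewport.
function gradientPreloadLinks() {
  return [
    {
      rel: 'preload',
      as: 'image',
      type: 'image/webp',
      href: '/img/gradient-mobile.webp',
      media: `(max-width: ${MOBILE_MAX_WIDTH}px)`,
    },
    {
      rel: 'preload',
      as: 'image',
      type: 'image/webp',
      href: '/img/gradient-desktop.webp',
      media: `(min-width: ${MOBILE_MAX_WIDTH + 1}px)`,
    },
  ];
}

// Render a descriptor as the <link> tag it represents.
function toLinkTag(attributes) {
  const rendered = Object.entries(attributes)
    .map(([name, value]) => `${name}="${value}"`)
    .join(' ');
  return `<link ${rendered}>`;
}

console.log(gradientPreloadLinks().map(toLinkTag).join('\n'));
```

The two media queries do not overlap, so at any viewport width only one of the two files is requested.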



The next metric is Cumulative Layout Shift. Also known as ‘I swear I clicked here but it jumped 20px down at that moment’. This happens because another element loads before (or maybe even after) the one you are aiming for, causing the whole layout to shift. Before we move on, let’s talk about server/client side. Our app uses Nuxt.js for Server Side Rendering, which means that most of the HTML in our layout is served directly to the browser. So, not an empty page where JS then does all the work, but an already completed page ready for interaction. This also means that the bots can crawl/index/whatever without waiting for various JS scripts to run, which enables them to navigate through the other pages as well. Of course, as frontend evolved, bots have learned to wait for JS regardless, but they much prefer to be handed final content without having to lift a finger.


Lazy loading is not only for images


As our most searched for pages have a lot of content that is located ‘below the fold’ (you need to scroll to see it), we implemented lazy loading. The term is most commonly used for images, meaning they will load only when they come into the user’s viewport. However, this technique can and should also be used for ‘normal’ elements, because components themselves affect the loading time during which the user is ‘blocked’ from any interaction (like clicking or scrolling). Once they are loaded, you guessed it, the CLS occurs and the button you wanted to click is suddenly 80px below you and you clicked on an annoying ad instead. That blocking time occurs because JS takes over control of those HTML elements, carefully listening to changes and events that are happening – this is called hydration. In the old times, you would need an external script for implementing lazy loads on Vue (or any other) components. Today, all you need is a small code block using the Intersection Observer API. In short, every modern browser has it and it is used to tell us if something is in or close to the viewport. In order to keep that bot happy, the keyworded text must be there. In our case, on pages where we have 100+ job ads, we serve the keywords (job title, company, any info that makes users come to us) in plain text so the loading time is cut. Only after scrolling does that plain text transform into an aesthetically pleasing component, ready for interaction, because by then we have passed the point the metrics measure (initial page load and others). In other words, the time needed for user interaction and page load is way down because there is no hydration to occur on initial loads. Worth noting that this practice only works if the plain text and the text in components is the content users are there for. Any other case would mean spam and negative rankings.
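The ‘small code block’ can be sketched roughly like this. It is not our production code: the function name whenNearViewport, the 200px margin and the injectable Observer option are assumptions made for this example (the last one only so the logic can be tried outside a browser). The IntersectionObserver calls themselves (constructor with a callback and a rootMargin option, observe, disconnect, isIntersecting) are the standard browser API.

```javascript
// Run `onVisible` once, the first time `element` gets within `rootMargin`
// of the viewport. Returns a function that cancels the watching.
function whenNearViewport(element, onVisible, options = {}) {
  const {
    rootMargin = '200px', // assumed margin: start a bit before it is visible
    Observer = globalThis.IntersectionObserver,
  } = options;

  // Without the API (very old browser) there is nothing to defer with,
  // so run the callback right away.
  if (typeof Observer !== 'function') {
    onVisible();
    return () => {};
  }

  const observer = new Observer((entries) => {
    if (entries.some((entry) => entry.isIntersecting)) {
      observer.disconnect(); // one-shot: stop watching after the first hit
      onVisible();
    }
  }, { rootMargin });

  observer.observe(element);
  return () => observer.disconnect();
}

// Possible use inside a Vue component (hypothetical `hydrated` flag that
// switches the plain text for the full component):
// mounted() {
//   this.stopWatching = whenNearViewport(this.$el, () => { this.hydrated = true; });
// }
```

Until the callback fires, the element stays plain server-rendered text, which is what keeps hydration out of the initial load.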



So much text, so many places to trip upon


I previously mentioned that because of gradients, we learned of other loading impacts on the site. A major one is the font. As you may know, woff2 is the preferred format nowadays as it is supported by every meaningful browser and is smaller in size. Nice looking fonts normally have different files for their variations (bold, italic, bolder, regular…) and so it is in our case. Most, if not every, element on our pages has some text in it. This means fonts are responsible for a part of every single metric out there. On the CSS side, there are a few strategies that can improve vitals. Each one has its downside. Flash Of Unstyled Text is when the browser shows the text in a generic font and then replaces it with the downloaded one when it is ready. Flash Of Invisible Text is when the text is invisible (although taking the required space in the layout) until the downloaded font is ready. Turns out, FOIT is a better option for CLS as the text remains in its reserved space, without causing jumps around it. The only problem is, if the font takes too long to load, then either the generic font causes the jump or the invisible text remains for some more time. The solution that worked best for us is using font-display: fallback in the @font-face rule, in a combo with preload. This means there is a very small time period for FOIT if it is needed, but we ensure the font is loaded first so the change happens instantly. The logic is the same as for the gradient, but with a few caveats. We only preload some font files, as not all are needed for the LCP or other large elements. The order of preload links is important. This is something not explained enough on the web, but for some reason the image needs to go before the font file, otherwise one or the other will download normally. What is explained enough is that font preloads need to have a crossorigin attribute as well. It also needs mentioning that, in our experience, the browser did not take more than 5 preload links into account. We use local font files, because calling a CDN for a 19 KB preload alternative is not effective, as the call itself will take time.
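The font setup described above can be sketched as follows. The font family, the file path and the function names are placeholders invented for this example. The sketch shows two things: an @font-face rule with font-display: fallback, and a preload descriptor for the same file with the crossorigin attribute (fonts are fetched in CORS mode, so it is needed even for local files).

```javascript
// Build an @font-face rule for one woff2 file.
// font-display: fallback keeps the text invisible only for a very short
// moment, then shows the generic font, and swaps to the web font only if
// it arrives within a short window.
function fontFaceRule({ family, url, weight = 400, style = 'normal' }) {
  return [
    '@font-face {',
    `  font-family: "${family}";`,
    `  src: url("${url}") format("woff2");`,
    `  font-weight: ${weight};`,
    `  font-style: ${style};`,
    '  font-display: fallback;',
    '}',
  ].join('\n');
}

// Preload descriptor for the same file, in the shape Nuxt accepts in the
// link array of its head configuration.
function fontPreloadLink(url) {
  return {
    rel: 'preload',
    as: 'font',
    type: 'font/woff2',
    href: url,
    crossorigin: 'anonymous',
  };
}

// Following the ordering observation from the text: image preloads first,
// font preloads after them.
function orderPreloads(imageLinks, fontLinks) {
  return [...imageLinks, ...fontLinks];
}

// Placeholder font, only the regular weight is preloaded.
const regularUrl = '/fonts/example-sans-regular.woff2';
console.log(fontFaceRule({ family: 'Example Sans', url: regularUrl }));
console.log(fontPreloadLink(regularUrl));
```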




Things too generic to put into a long article


Regarding other images, static ones should be optimized via easy to use free online tools, as they can take off up to 90% of the file size without visible quality loss. Smaller images without many colors or complex shapes should be used as inline SVG. This is particularly useful for hamburger menus, header logos and icons in lists. Because of SSR, the browser will be ready to render them just like ‘regular’ HTML elements, reducing the need for resource downloads and also making the CLS disappear. SVG code can be optimized with the SVGOMG tool. This is particularly useful because of the ‘Excessive DOM size’ item. That tool ‘crops’ the SVG code into single paths, reducing the final DOM size to a small number of elements (sometimes even 1 if the image can be drawn with 1 line).

Regarding JS, the biggest move in the right direction was achieved via lazy loading. Regarding SSR, it will cause a small delay on the first load, but this is compensated by the lack of delays on the next load (because of cache) and by the lack of delays on navigation (because Nuxt renders components on other routes asynchronously). Regarding CSS, everything that can be done in it – do it in it. Animations, hovers, transitions, all are better done in CSS performance-wise. For that matter, try to use the same component in a responsive manner instead of creating different ones for different device sizes.

Carefully read what the metrics tools tell you. In most cases, you are capable of solving problems even without googling, as most are just related to stuff you were too lazy to fix. When you solve one small item, you get two bigger ones, but when you solve those two then you automatically fix another three. The happiness a developer feels when the graphs go from red to green is invaluable.