Sometimes, removing URLs from Google is not as straightforward as adding a NOINDEX tag. Many website owners and SEO teams face situations where an old URL continues appearing in Google search results even after implementing the correct SEO directives. This becomes even more confusing when the page is already blocked in robots.txt.
At Dhairvi Solutions, we recently faced a real-world technical SEO issue where an old URL remained indexed in Google despite having NOINDEX, NOFOLLOW implemented correctly. After detailed analysis, we discovered the actual issue was not the meta robots tag itself, but a crawl restriction inside robots.txt that prevented Googlebot from revisiting the page.
This scenario highlights an important SEO concept that many websites overlook: Google must crawl a page before it can process NOINDEX and remove the URL from search results.
Why Google Is Still Showing Old URLs in Search Results
In our case, the URL had been indexed by Google years ago. Later, the website added a robots.txt rule to block URLs containing query parameters (?) to optimize crawl budget and reduce unnecessary crawling.
The robots.txt rule looked like this:
Disallow: /*?
After some time, the page was updated with:
<meta name="robots" content="NOINDEX, NOFOLLOW">
However, the URL still continued appearing in Google search results.
The reason was simple but easy to miss:
Googlebot was blocked from crawling the page again, so it never saw the updated NOINDEX directive.
Why NOINDEX Was Not Working
Many people assume that adding NOINDEX immediately removes a page from Google. But in reality, Google follows a process:
- Crawl the page
- Read the NOINDEX tag
- Update the search index
- Remove the URL from results
If crawling is blocked before Google processes the NOINDEX tag, the URL can remain indexed for a very long time.
This is exactly what happened in our scenario.
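The blocked-crawl trap described above can be sketched as a tiny simulation. This is an illustrative toy model, not Google's real pipeline, and the URL is the hypothetical parameter URL from this case study:

```python
# Toy model: a URL stays in the "index" until a successful crawl
# actually reveals the NOINDEX directive. Not Google's real pipeline.

def recrawl(index, url, crawl_allowed, page_has_noindex):
    """Attempt one recrawl of `url` and update the toy index in place."""
    if not crawl_allowed:
        return index              # robots.txt block: nothing changes
    if page_has_noindex:
        index.discard(url)        # directive finally seen: URL drops out
    else:
        index.add(url)
    return index

url = "/catalogsearch/advanced/?srsltid=abc"   # hypothetical parameter URL
index = {url}                                  # indexed years ago

# Page now carries NOINDEX, but a Disallow rule blocks the crawl:
recrawl(index, url, crawl_allowed=False, page_has_noindex=True)
print(url in index)   # True - still indexed, NOINDEX never seen

# Re-open crawling and the directive finally takes effect:
recrawl(index, url, crawl_allowed=True, page_has_noindex=True)
print(url in index)   # False - removed after recrawl
```

The key point the toy model captures: as long as `crawl_allowed` is `False`, the state of the page itself is irrelevant, because the index entry is never revisited.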
How Robots.txt Can Prevent URL Removal from Google
The main issue was not indexing itself. The issue was crawl access.
The URL already existed in Google’s index before the robots.txt block was added. Once the following rule was implemented:
Disallow: /*?
Google could no longer access URLs containing parameters.
As a result:
- Google retained the old indexed version
- Google could not read the new NOINDEX tag
- The URL remained visible in search results
This is one of the most misunderstood technical SEO behaviors.
Difference Between Crawling and Indexing in SEO
Understanding the difference between crawling and indexing is critical for technical SEO.
Crawling: Googlebot visits and reads the webpage.
Indexing: Google stores the page in its search database.
A page can remain indexed even when crawling is blocked. That is why robots.txt does not automatically remove pages from Google.
How We Fixed the SEO Issue
To solve the issue, we temporarily allowed Googlebot to crawl the affected URL again.
We added Allow rules in robots.txt:
Allow: /catalogsearch/advanced/
Allow: /catalogsearch/advanced/?srsltid=
Because a longer, more specific Allow path takes precedence over the shorter Disallow: /*? pattern, this allowed Google to:
- Access the page
- Read the NOINDEX tag
- Process the removal correctly
After Google recrawled the page, the URL gradually disappeared from search results.
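The precedence behavior behind this fix can be sketched with a minimal matcher. It approximates Google's documented rule selection (the most specific, i.e. longest, matching rule wins, and Allow wins a tie); the `srsltid` URL comes from the case above, and this is a simplified sketch, not Google's actual parser:

```python
import re

def _pattern_to_regex(path):
    """Translate a robots.txt path ('*' wildcard, optional '$' anchor) to a regex."""
    anchored = path.endswith("$")
    body = path[:-1] if anchored else path
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.compile(regex + ("$" if anchored else ""))

def is_allowed(rules, url_path):
    """rules: list of ('allow'|'disallow', path) pairs.
    Longest matching path wins; 'allow' wins a length tie;
    a path matched by no rule at all is allowed."""
    best = ("allow", -1)  # (verdict, matched rule length)
    for verdict, path in rules:
        if _pattern_to_regex(path).match(url_path):
            length = len(path)
            if length > best[1] or (length == best[1] and verdict == "allow"):
                best = (verdict, length)
    return best[0] == "allow"

url = "/catalogsearch/advanced/?srsltid=abc"   # hypothetical query string

# Original robots.txt: only the parameter block exists.
print(is_allowed([("disallow", "/*?")], url))  # False - crawl blocked

# After the fix: the longer Allow rules override the shorter Disallow.
rules = [
    ("disallow", "/*?"),
    ("allow", "/catalogsearch/advanced/"),
    ("allow", "/catalogsearch/advanced/?srsltid="),
]
print(is_allowed(rules, url))                  # True - crawl permitted
```

Note that an unrelated parameter URL such as `/some/other/page?x=1` is still blocked under the new rules, so the crawl-budget protection stays in place for everything except the affected path.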
Best SEO Process to Remove URLs from Google
If a URL is already indexed, the safest removal process is:
Step 1: Add NOINDEX
Use:
<meta name="robots" content="NOINDEX, NOFOLLOW">
Step 2: Keep the Page Crawlable
Do not block the page in robots.txt immediately.
Step 3: Wait for Google to Recrawl
Google needs time to process the updated instruction.
Step 4: Verify Removal
Check Google Search Console and search results.
Step 5: Block Crawling Later If Needed
Reapply the robots.txt block only after de-indexing is complete.
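The ordering in these steps can be expressed as two robots.txt phases. The `/old-page/` path is a placeholder, not from the case above:

```text
# Phase 1 - while de-indexing: page carries NOINDEX, crawling stays open
User-agent: *
Allow: /old-page/

# Phase 2 - only after the URL has dropped out of the index
User-agent: *
Disallow: /old-page/
```

Deploying the Phase 2 block before the URL has actually left the index recreates exactly the trap described earlier in this article.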
Common SEO Mistakes That Keep URLs Indexed
Blocking URLs Too Early
If Google cannot crawl the page, NOINDEX cannot work.
Using Robots.txt to Remove URLs
Robots.txt controls crawling, not indexing.
Blocking All Parameter URLs Without Review
Rules like:
Disallow: /*?
can unintentionally create indexing problems for already indexed pages.
Technical SEO Tips for URL Removal
- Always use NOINDEX before blocking crawling
- Review robots.txt rules carefully
- Monitor indexed URLs regularly
- Use Google Search Console URL Inspection
- Check crawl accessibility before debugging indexing issues
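For checks like Step 1 and Step 4 above, the meta robots tag can be verified with Python's standard-library HTML parser. This is a minimal sketch; in practice you would fetch the live page (and also check the `X-Robots-Tag` HTTP header, which Google honors as an alternative to the meta tag):

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect directives from any <meta name="robots"> tag in a page."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            content = attrs.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]

# In practice this HTML would come from fetching the live URL:
html = '<head><meta name="robots" content="NOINDEX, NOFOLLOW"></head>'
parser = RobotsMetaParser()
parser.feed(html)
print("noindex" in parser.directives)   # True
```

Remember that this check only tells you what the page says; whether Googlebot can actually reach the page to read it is a separate robots.txt question.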
Conclusion
If Google is not removing a URL even after adding NOINDEX, the problem may actually be robots.txt.
In our case at Dhairvi Solutions, the URL had been indexed before crawl blocking was implemented. Once Googlebot lost access to the page, it could no longer detect the updated NOINDEX instruction.
This created a situation where:
- The page should not be indexed
- But Google could not process the removal
The key SEO lesson is simple:
Google must crawl a page before it can remove it from search results.
Understanding how crawling, indexing, robots.txt, and NOINDEX work together is essential for proper technical SEO and successful URL removal strategies.
FAQs
Why is Google still indexing my NOINDEX page?
Because Google may be blocked from crawling the page again.
Can robots.txt remove URLs from Google?
No. Robots.txt only blocks crawling access.
Why do blocked URLs still appear in Google search results?
Because the URLs were indexed before the robots.txt block existed.
What is the safest way to remove old URLs from Google?
Add NOINDEX first and keep the page crawlable until Google processes it.
Should I block parameter URLs in robots.txt?
Only after confirming Google has removed them from its index.

