Was there any recent change to the data that search engine robots rely on?

I just noticed that Google search results display reference numbers in addition to the text. It seems to affect only Bhikkhu Bodhi’s translation, not Bhante Sujato’s. Here’s another example:


Was there any recent change to the data that search engine robots rely on?

Ven. @snowbird is this your department?

Hmm. I’m not seeing that. This is Google in Firefox on Windows in NZ:

It doesn’t show snippets at all, just the blurb.

But to answer your direct question, I don’t think there were any changes that would affect this. The difference between Bhante Bodhi’s translation (legacy) and Bhante Sujato’s (aligned) is that the legacy texts are more like hard-coded HTML. I don’t know if that is what’s causing the difference.

If I search for a string of text, this is what I get:

That is weird. The reference number doesn’t show up if I use Chrome to search.

Can you repeat your search using Firefox, Ven. @Snowbird?

Never mind, you used Firefox. I didn’t read carefully.

I think I can guess the cause. Please correct me if I’m wrong.

With all references turned off, I checked the DOM (if that is the correct technical term) and didn’t see any references in it.

Now, if I enable references and check the DOM again, there is an additional span class="references" in the DOM.

The examples above show the difference in the DOM for Bhante Sujato’s translation. By default, it doesn’t include any references in the DOM. Thus, when Google’s robot crawled the page to collect the data, it wouldn’t have captured any references.

Having said that, let’s compare the DOM for Bhikkhu Bodhi’s translation. Let’s see what the DOM looks like with references disabled.

The references are embedded in the default DOM whether I enable or disable references. I suspect this is the cause of the issue: references are included in the default DOM for legacy translations. To verify my assumption, I ran a search on another legacy translation. Bhante Suddhaso’s legacy translation has the same issue:

To fix this issue, I think the default DOM must not include any reference.
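To make that suggestion concrete, here is a minimal Python sketch of what "no references in the default DOM" could mean for a hard-coded page. I am assuming the legacy markup wraps each reference in a span with class "references", like the one I saw in the aligned texts; the sample markup and the names `ReferenceStripper` and `strip_references` are made up for illustration and are not SuttaCentral code. It only handles a fragment (comments and doctype declarations are dropped).

```python
from html.parser import HTMLParser


class ReferenceStripper(HTMLParser):
    """Rebuild an HTML fragment, dropping every <span class="references"> subtree.

    The class name "references" is an assumption about the legacy markup.
    """

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while we are inside a references span

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag == "span":
                self.skip_depth += 1  # nested span inside a reference
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "references" in classes:
            self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied once, without an end tag.
        if not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag == "span":
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_references(html):
    """Return the fragment with all reference spans removed."""
    parser = ReferenceStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Hypothetical legacy fragment, not copied from the site.
legacy = (
    '<p><span class="references"><a class="sc" id="sc1">SC 1</a></span>'
    "Thus have I heard.</p>"
)
print(strip_references(legacy))  # -> <p>Thus have I heard.</p>
```

The references would then have to be added back by script when a reader enables them, which seems to be how the aligned texts already behave.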

I spoke too soon. Chrome suffers from the same issue.

I think you are correct in your diagnosis, and I think DOM is the correct term.

The legacy texts, as I said, are just hard-coded HTML pages. It appears that for them the refs are turned on and off just by CSS.
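A rough Python sketch of why hiding by CSS would not help here: anything that reads the markup without applying the stylesheet still sees the reference text. Whether Google builds its snippets this way is only my assumption, and the two HTML fragments and the names `TextOnly` and `crawler_text` are invented for illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect every text node, ignoring CSS entirely."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def crawler_text(html):
    """Return the text a CSS-blind reader would extract from the markup."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the result looks like a snippet.
    return " ".join("".join(parser.chunks).split())


# Hypothetical legacy markup: the reference is always in the HTML,
# and a separate stylesheet hides it with display: none.
LEGACY_HTML = '<p><span class="references">SC 1 </span>Thus have I heard.</p>'

# Hypothetical aligned markup: references are absent until the reader enables them.
ALIGNED_HTML = "<p>So I have heard.</p>"

print(crawler_text(LEGACY_HTML))   # -> SC 1 Thus have I heard.
print(crawler_text(ALIGNED_HTML))  # -> So I have heard.
```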

It would be up to Bhante @Sujato, but I’m guessing it’s not worth re-doing how the legacy texts are handled in order to have slightly cleaner search result snippets.

Thanks for digging into this!

Thank you, Bhante.

I wish I could help, but my web development skills are from the era of PHP and Perl (thinking of Perl makes me want to puke :nauseated_face::face_vomiting:).

I’ve heard PHP is making a comeback, but I don’t think there is a line of it in SuttaCentral. I believe everything is JS and Python.

Personally, in the grand scheme of things, I don’t think it’s a big deal to have the ids there. Also, Google is constantly adjusting what they do, so it may resolve itself. On top of that, this is the kind of thing where pulling on a single thread might cause other things to unravel. :face_with_spiral_eyes:
