3

I am scraping Bing search results using Node and cheerio. I need to grab all the href values from two lists that have different IDs.

  1. How can I grab all the <a> tags from both of these lists in one statement? I tried the code below, but it didn't work.
  2. From the first list, I do not want the li tags with the class "b_pag". How can I write a selector for that? Something like a "not" condition.

$("a", ["#b_content", "#b_context"]).each((index, element) => { const href = $(element).attr("href"); links.push(href); });

Refer to the attached screenshot for the HTML.

Update 2: I wanted to ignore the whole <li class="b_pag"> tag, but the solutions I found here and elsewhere ignore only that tag itself. Any other <li> tag nested under it, with a different class or no class, is not ignored.

I found a way around it: I can grab the <li> tags that have the other class names. Check out the html here. I am thinking of using four different selectors for the first four classes, like $(.b_algo) or $(.b_ans). But how can I grab the other two <li> tags, which have multiple classes? I could not get a clear idea from the cheerio docs. Something like $(.b_ans b_mop) didn't work. Nor did $("li[class=b_ans b_mop").

4
  • 1
    does all the li hold a <a> tag? check out my work around. jsfiddle.net/apmnky0b
    – Zaza
    Commented Feb 27, 2019 at 6:07
  • yeah all li tags have a tags along with other tags
    – Rohit
    Commented Feb 27, 2019 at 6:25
  • I saw your code. It gets all the a tags under ol with b_content as class. I want to ignore the li tag that has "b_pag" as class. See the last li in the screenshot. That is what I want to ignore. All the other li tags are needed.
    – Rohit
    Commented Feb 27, 2019 at 6:28
  • 1
    check out this code jsfiddle.net/roftsnap
    – Zaza
    Commented Feb 27, 2019 at 6:54

3 Answers

3

Try this,

$("#b_content", "#b_context").each(function(i, elem) {
    array[i] = {
        a: $(this).find("a").attr("href")
    };
});

To select every "li" except those with the class "b_pag", use li:not(.b_pag)

7
  • $("#b_content", "#b_context") looks for #b_content inside #b_context, which doesn't exist, so I get an empty result. The "li:not(.b_pag)" part is what I was looking for. Where exactly would you write it in the code? I can grab the <ol id="b_results"> and then call find("li") as follows: $("#b_results").find("li").each((i, el) => { // something }); Where should I insert the not condition?
    – Rohit
    Commented Feb 27, 2019 at 6:40
  • 1
    Here you go: $("#b_results").find("li:not(.b_pag)").each((i, el) => { // something });
    Commented Feb 27, 2019 at 10:48
  • I found a way around that problem. I have another question, though. How can I grab an element like <li class="a b c">? Does this mean that the element has three classes (a, b and c)?
    – Rohit
    Commented Mar 5, 2019 at 5:59
  • 1
    Yes! It is similar to some Bootstrap classes: <a class="btn btn-default btn-block">..</a>
    Commented Mar 5, 2019 at 10:01
  • 1
    This question has been answered before; check this out: link
    Commented Mar 6, 2019 at 6:00
1

Try this one

$(".b_content li[class!='b_pag']").find("a").each((index, element) => { const href = $(element).attr("href"); console.log(href); });

If you want to ignore a class, use the attribute selector on the respective tag, like this: li[class!='b_pag']

1
  • The != does work, and li:not() as Dipesh Lohani suggested above also ignores the <li class="b_pag">. But there are other <ul> and <li> tags under <li class="b_pag">, and they are not ignored by this statement. Is there any way to ignore the entire <li class="b_pag"> tag, even if there are other <li> tags under it with some other class? For example: <li class="b_pag"> <!--something--> <ul> <li></li> <li></li> <li></li> </ul> </li> I need to ignore the whole <li class="b_pag">.
    – Rohit
    Commented Feb 28, 2019 at 6:00
1

Try using Bing Web Search API instead: https://azure.microsoft.com/en-us/services/cognitive-services/bing-web-search-api/

It is the legal and better way to get Bing search results. You can sign up for the free tier of this API if you do not have a lot of searches to do. You can also use the free Azure credit that you receive when you join Azure.

1
  • Thanks bro. I will surely check it out :)
    – Rohit
    Commented Mar 1, 2019 at 9:59
