I have been doing SEO for about a year now, and feeling I had reached a certain level, I figured I should be able to run one or two sites of my own. So in late August I opened a Taobao shop selling gadgets, partly as practice. As everyone knows, Taobao and Baidu both do business around online commerce; they are competitors, and naturally incompatible: Taobao.com ignores Baidu and blocks its spider from crawling. So I thought of doing what others do: build a separate site, promote it, get it ranked in Baidu, and use it to funnel some traffic to my Taobao shop. Once decided, I got the site ready to go live at the beginning of September. Below is the optimization process before the site went live:
I found that the robots rules themselves were not the problem. I then looked at the robots.txt file generated by the site program, logged into the webmaster tools to check the fetch results, and the two matched exactly; but in the "removal tool permissions" column the first line showed as ?User-agent: *, flagged as a syntax error. Where had that extra "?" come from? It was completely baffling. With no other leads, I resorted to a trick: simulating a search engine fetching the file to see what was going wrong. That did reveal the problem. The file's encoding was wrong: it had been saved as UTF-8 with a byte order mark (BOM), and the invisible BOM bytes in front of "User-agent" were rendered as "?", so the search engine could not parse the rules. Re-saving robots.txt as plain UTF-8 without the BOM fixed it.
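The BOM problem described above is easy to reproduce. This is a minimal sketch: Python's `utf-8-sig` codec writes the same three BOM bytes (EF BB BF) that a Windows editor adds when saving "UTF-8 with BOM", and a strict parser then no longer recognizes the first `User-agent` line.

```python
# Minimal sketch reproducing the robots.txt BOM problem described above.
# Saving as "UTF-8 with BOM" prepends the invisible bytes EF BB BF, which
# show up as a stray "?" before "User-agent" in crawler diagnostic tools.

RULES = "User-agent: *\nDisallow:\n"  # intended rules: allow everything

# Simulate the buggy save: the "utf-8-sig" codec writes the BOM for us.
with open("robots.txt", "w", encoding="utf-8-sig") as f:
    f.write(RULES)

raw = open("robots.txt", "rb").read()
print(raw[:3])  # b'\xef\xbb\xbf' -- the three stray BOM bytes

# A strict parser sees a first line that is NOT "User-agent: *":
first_line = raw.decode("utf-8").splitlines()[0]
print(first_line == "User-agent: *")  # False: it begins with '\ufeff'

# The fix: decode/save without the BOM (plain UTF-8 or ASCII).
clean = raw.decode("utf-8-sig")
print(clean.splitlines()[0] == "User-agent: *")  # True
```

Re-saving the file from an editor set to "UTF-8 without BOM" (or ANSI, since robots.txt is plain ASCII) produces a file the spider can parse.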
I put robots.txt back to allow search engines to crawl the content, posted one or two original articles, and submitted the site to the major search engines. Then, between 11:00 pm and 12:30 am that evening, I went to A5's special section to attract spiders, publishing original articles with links to my site in them, and quickly drew the spiders to crawl it. After that I sat back, expecting the site to be indexed by the next day at the latest. To my surprise, three days later it still was not, which felt very strange: during that period I had kept updating content and building some backlinks, so it should have been indexed. I downloaded the logs via FTP and found that the spider had indeed come at midnight on launch night, but it fetched only robots.txt and left. Very puzzled, I reckoned the robots.txt file must be wrong, so I opened it to take a look at what the spider was seeing.
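The log check above can be sketched in a few lines. The log format (Apache combined style) and the sample entries here are assumptions for illustration, not the article's actual logs; the idea is simply to filter for the Baiduspider user agent and list which paths it requested.

```python
# Sketch: scan access-log lines for Baiduspider hits and collect the
# requested paths. Log format and entries are assumed (Apache combined
# style); they are not taken from the article's real logs.
import re

LOG_LINES = [
    '1.2.3.4 - - [05/Sep/2012:00:05:11 +0800] "GET /robots.txt HTTP/1.1" 200 64 "-" '
    '"Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"',
    '5.6.7.8 - - [05/Sep/2012:00:06:02 +0800] "GET / HTTP/1.1" 200 5123 "-" "Mozilla/5.0"',
]

spider_hits = []
for line in LOG_LINES:
    if "Baiduspider" in line:                    # keep only spider requests
        m = re.search(r'"GET (\S+) HTTP', line)  # pull out the URL path
        if m:
            spider_hits.append(m.group(1))

print(spider_hits)  # ['/robots.txt'] -> the spider fetched only robots.txt
```

A result like `['/robots.txt']` with no other paths is exactly the symptom described: the spider came, read the rules file, and crawled nothing else.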
First of all, I wanted to fight this as a positional battle, so I worked methodically. Before going live I used robots.txt to block spiders from crawling while I worked on the site: I adjusted the layout and deleted the original JS code and other redundant junk code. Then I wrote the site title, making sure it would not be taken for keyword stuffing, and filled each section with four or five original or pseudo-original articles, so that when the search engines came to crawl they would not find a site with no content worth grabbing. With the title, content, structure, and layout of the site sorted out, it was ready to go live.
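The pre-launch blocking mentioned above is normally just a two-line robots.txt; this is its standard form:

```
User-agent: *
Disallow: /
```

At launch, the `Disallow: /` line is changed to `Disallow:` (empty value), which permits all crawling, as described in the next step.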