[devdocsgjs/main: 1/64] Add docs for process_response? and parse.

commit 14a8c9bc7cbebe287ef0b719b818e7c5a247cc31
Author: xjkdev <xjk2008 gmail com>
Date:   Tue Mar 15 00:48:05 2022 +0800

    Add docs for process_response? and parse.

 docs/scraper-reference.md | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
---
diff --git a/docs/scraper-reference.md b/docs/scraper-reference.md
index 88bbdf04..d5ed5074 100644
--- a/docs/scraper-reference.md
+++ b/docs/scraper-reference.md
@@ -187,6 +187,30 @@ More information about how filters work is available on the [Filter Reference](.
 
   _Note: this filter is disabled by default._
 
+### Processing responses before filters
+
+These methods are run before the filter stacks and can process responses directly.
+
+* `process_response?(response)`
+
+  Determines whether a response should be processed. The response is dropped if this method returns `false`.
+
+  It is useful for filtering out pages, such as empty, invalid, or redirecting pages, based on their content.
+
+  Example: [lib/docs/scrapers/kotlin.rb](../lib/docs/scrapers/kotlin.rb)
+
+
+* `parse(response)`
+
+  Parses an HTTP or file response and, by default, converts it to a Nokogiri document.
+
+  Override this method if you want to modify the HTML source before Nokogiri parses it.
+  This is useful for preserving whitespace in code segments inside non-`pre` blocks, because Nokogiri may remove it.
+
+  Example: [lib/docs/scrapers/go.rb](../lib/docs/scrapers/go.rb)
+
+
+
 ## Keeping scrapers up-to-date
 
 In order to keep scrapers up-to-date the `get_latest_version(opts)` method should be overridden. If 
`self.release` is defined, this should return the latest version of the documentation. If `self.release` is 
not defined, it should return the Epoch time when the documentation was last modified. If the documentation 
will never change, simply return `1.0.0`. The result of this method is periodically reported in a 
"Documentation versions report" issue which helps maintainers keep track of outdated documentations.

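As a rough illustration of the `process_response?(response)` hook documented in the patch above, here is a minimal sketch. It is not the code from lib/docs/scrapers/kotlin.rb. `FakeResponse`, `FakeBaseScraper`, and `RedirectSkippingScraper` are hypothetical stand-ins so the snippet runs on its own, and the meta-refresh check is only an assumed way of spotting a redirecting page.

```ruby
# Hypothetical stand-ins; the real response object and scraper base class
# live in the DevDocs codebase and are not reproduced here.
FakeResponse = Struct.new(:body)

class FakeBaseScraper
  # Stand-in default: accept every response.
  def process_response?(response)
    true
  end
end

class RedirectSkippingScraper < FakeBaseScraper
  # Returning false drops the response before any filter runs.
  def process_response?(response)
    return false unless super                # keep the parent's checks

    body = response.body.to_s
    return false if body.strip.empty?        # drop empty pages

    !body.match?(/http-equiv="refresh"/i)    # drop meta-refresh redirects
  end
end
```

Calling `super` first keeps whatever checks the parent class already performs, so the override only adds conditions.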

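The `parse(response)` override described in the patch can be sketched in the same spirit. This is an assumption-laden example, not the code from lib/docs/scrapers/go.rb. `SketchResponse`, `SketchBaseScraper`, `WhitespacePreservingScraper`, and the `sample` class name are all hypothetical. The stand-in parent returns the raw HTML string, whereas the documented default builds a Nokogiri document.

```ruby
# Hypothetical stand-ins so the sketch is self-contained.
SketchResponse = Struct.new(:body)

class SketchBaseScraper
  # Stand-in for the default parse step. The documented default converts
  # the response to a Nokogiri document; this one returns the raw HTML.
  def parse(response)
    response.body
  end
end

class WhitespacePreservingScraper < SketchBaseScraper
  # Rewrite <code class="sample"> blocks to <pre> before the default parser
  # sees them, so the whitespace inside them is kept.
  def parse(response)
    response.body = response.body.gsub(
      %r{<code class="sample">(.*?)</code>}m,  # /m lets . match newlines
      '<pre>\1</pre>'
    )
    super
  end
end
```

Editing the body as a string and then delegating to `super` leaves the rest of the parsing path unchanged.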