object SplashMain
Type Members
Value Members
- final def !=(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def ##(): Int
  - Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  - Definition Classes: Any
- def clone(): AnyRef
  - Attributes: protected[java.lang]
  - Definition Classes: AnyRef
  - Annotations: @native() @throws( ... )
- final def eq(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- def finalize(): Unit
  - Attributes: protected[java.lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def hashCode(): Int
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- final def isInstanceOf[T0]: Boolean
  - Definition Classes: Any
- def main(args: Array[String]): Unit
- final def ne(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- final def notify(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def notifyAll(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def synchronized[T0](arg0: ⇒ T0): T0
  - Definition Classes: AnyRef
- def toString(): String
  - Definition Classes: AnyRef → Any
- final def wait(): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  - Definition Classes: AnyRef
  - Annotations: @native() @throws( ... )
This is the documentation for the `splish` library.

Package Information
The splish package contains a single class splish.Splash with methods for interacting with a ScrapingHub Splash instance.
If you haven’t hit the above link yet and are unfamiliar with Splash, the TL;DR is that it’s an alternative to Selenium: it’s a full browser and executes JavaScript. The rendering engine is based on Qt WebKit, and Splash instances have a REST API that provides a lot of flexibility when needed and ease of use for more casual scraping tasks.
You can get it up and running locally with Docker via:
If you've built the source and run `sbt packInstall`, you can start playing with `splash` on the command line via `~/local/bin/splash-main`. Here's the help:

The first thing we need to do is make a connection to the server:
We can test that connection and get some other information as well:
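As a rough sketch of what such a check involves at the HTTP level, the snippet below calls Splash's `_ping` endpoint directly with the JDK's built-in HTTP client (Java 11+). It does not use the `splish` methods themselves, since their names are not shown here. The object name `SplashPingSketch` and the base URL `http://localhost:8050` (the default port of the Splash Docker image) are assumptions for illustration.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

// Hypothetical helper, not part of splish.
object SplashPingSketch {
  // Assumed base URL: a local Splash instance on its default port.
  val defaultBase = "http://localhost:8050"

  // Build the URI of Splash's `_ping` health-check endpoint.
  def pingUri(base: String = defaultBase): URI =
    URI.create(base.stripSuffix("/") + "/_ping")

  // Send the GET request and return (HTTP status code, raw JSON body).
  def ping(base: String = defaultBase): (Int, String) = {
    val client = HttpClient.newHttpClient()
    val request = HttpRequest.newBuilder(pingUri(base)).GET().build()
    val response = client.send(request, HttpResponse.BodyHandlers.ofString())
    (response.statusCode(), response.body())
  }
}
```

Calling `SplashPingSketch.ping()` against a running instance should return a 200 status and a small JSON document describing the server; the body is left as a raw string here so the sketch has no dependencies.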
The library makes use of [uJson](http://www.lihaoyi.com/upickle/#uJson) for more complex return types. A few methods return a `requests` [Response](https://github.com/lihaoyi/requests-scala/blob/master/requests/src/requests/Model.scala#L235-L276) object instead, because the result of a call to the more dynamic endpoints is unknowable at call time (Splash allows you to use Lua to perform complex page interaction, and you can return images, plaintext, HTML or JSON via the Lua interface).

The classic use case for Splash is to feed it a URL and get HTML back after it’s had time to process any JavaScript. The URL in the following example relies on JavaScript to add content to the page:
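For orientation, here is a minimal sketch of the underlying HTTP request, assuming a Splash instance at a caller-supplied base URL. It targets Splash's `render.html` endpoint with its `url` and `wait` parameters and uses only the JDK HTTP client (Java 11+). The object and function names are hypothetical and are not `splish` API.

```scala
import java.net.{URI, URLEncoder}
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of splish.
object RenderHtmlSketch {
  // Build the render.html URI; `waitSeconds` gives the page time to run its scripts.
  def renderHtmlUri(base: String, pageUrl: String, waitSeconds: Double): URI = {
    val root = base.stripSuffix("/")
    val encoded = URLEncoder.encode(pageUrl, StandardCharsets.UTF_8)
    URI.create(s"$root/render.html?url=$encoded&wait=$waitSeconds")
  }

  // Fetch the rendered HTML as a string.
  def renderHtml(base: String, pageUrl: String, waitSeconds: Double = 2.0): String = {
    val client = HttpClient.newHttpClient()
    val request = HttpRequest.newBuilder(renderHtmlUri(base, pageUrl, waitSeconds)).GET().build()
    client.send(request, HttpResponse.BodyHandlers.ofString()).body()
  }
}
```

The `wait` value is a trade-off: too short and script-generated content may be missing, too long and every request pays the delay.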
Here’s what that looks like just using the `requests` library:

Most of the other Splash API endpoints have corresponding methods in the library (the image-oriented ones are on the TODO list). We can get the same page in both Splash JSON:
and HAR formats:
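At the HTTP level these two formats correspond to Splash's `render.json` and `render.har` endpoints. The following is a sketch under the same assumptions as before (JDK HTTP client, caller-supplied base URL, hypothetical names that are not `splish` API). The `html` and `har` flags passed to `render.json` ask Splash to include the rendered HTML and the HAR data in the JSON result.

```scala
import java.net.{URI, URLEncoder}
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of splish.
object RenderFormatsSketch {
  private def enc(s: String): String = URLEncoder.encode(s, StandardCharsets.UTF_8)

  // Build a URI for any Splash endpoint from a list of query parameters.
  def endpointUri(base: String, endpoint: String, params: Seq[(String, String)]): URI = {
    val root = base.stripSuffix("/")
    val query = params.map { case (k, v) => enc(k) + "=" + enc(v) }.mkString("&")
    URI.create(s"$root/$endpoint?$query")
  }

  // Perform a GET and return the raw response body.
  private def fetch(uri: URI): String = {
    val client = HttpClient.newHttpClient()
    val request = HttpRequest.newBuilder(uri).GET().build()
    client.send(request, HttpResponse.BodyHandlers.ofString()).body()
  }

  // JSON result including the rendered HTML and HAR data.
  def renderJson(base: String, pageUrl: String): String =
    fetch(endpointUri(base, "render.json", Seq("url" -> pageUrl, "html" -> "1", "har" -> "1")))

  // HAR (HTTP Archive) record of the page load.
  def renderHar(base: String, pageUrl: String): String =
    fetch(endpointUri(base, "render.har", Seq("url" -> pageUrl)))
}
```

Both functions return raw JSON text; with uJson on the classpath, the body could be parsed with `ujson.read` for structured access.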