Friday, November 16, 2012

AndroidViewClient: Getting Browser's HTML page source

The standard Android Browser does not provide a menu option to view the HTML page source. Some workarounds have been described, like installing apps and then using Share page from Browser's menu, and the use of JavaScript and jQuery added to a page has also been detailed, but we are hungry for more. All these methods involve some manual step, so I felt the need to find a completely automatic way of doing it.

Of course, to do it I resorted to our old pal AndroidViewClient. This is a very interesting example of its use because it lies far from testing an application or UI.

And so, without further ado, let me introduce you to the code...


#! /usr/bin/env monkeyrunner
'''
Copyright (C) 2012  Diego Torres Milano
Created on Oct 12, 2012

@author: diego
'''


import sys
import os

# PyDev sets PYTHONPATH, use it
try:
    for p in os.environ['PYTHONPATH'].split(os.pathsep):
        if p not in sys.path:
            sys.path.append(p)
except KeyError:
    # PYTHONPATH is not set, nothing to add
    pass

# Use ANDROID_VIEW_CLIENT_HOME to find AndroidViewClient
try:
    sys.path.append(os.path.join(os.environ['ANDROID_VIEW_CLIENT_HOME'], 'src'))
except KeyError:
    # ANDROID_VIEW_CLIENT_HOME is not set, rely on the existing sys.path
    pass

# This must be imported before MonkeyRunner and MonkeyDevice,
# otherwise the import fails.
from com.dtmilano.android.viewclient import ViewClient

from com.android.monkeyrunner import MonkeyRunner, MonkeyDevice

VPS = "javascript:alert(document.getElementsByTagName('html')[0].innerHTML);"
PACKAGE = 'com.android.browser'
ACTIVITY = '.BrowserActivity'
COMPONENT = PACKAGE + "/" + ACTIVITY
URI = 'http://dtmilano.blogspot.com'


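# Connect to the device or emulator
device, serialno = ViewClient.connectToDeviceOrExit()

# Start Browser on the target page and give it time to load
device.startActivity(component=COMPONENT, uri=URI)
MonkeyRunner.sleep(3)

vc = ViewClient(device=device, serialno=serialno)

# Drag a bit to make the URL visible in case the page has scrolled
device.drag((240, 180), (240, 420), 10, 10)

# Touch the URL field to focus it
url = vc.findViewByIdOrRaise('id/url')
url.touch()
MonkeyRunner.sleep(1)

# Type the JavaScript one character at a time, then press ENTER
device.press('KEYCODE_DEL', MonkeyDevice.DOWN_AND_UP)
for c in VPS:
    device.type(c)
device.press('KEYCODE_ENTER', MonkeyDevice.DOWN_AND_UP)
MonkeyRunner.sleep(3)

# The alert dialog should be on screen now, so take a new dump
# and print the message text with its escaped newlines expanded
vc.dump()
print vc.findViewByIdOrRaise('id/message').getText().replace('\\n', "\n")

# Dismiss the alert dialog
device.press('KEYCODE_BACK', MonkeyDevice.DOWN_AND_UP)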


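The path setup at the top of the script can also be written as a small function, which makes it easy to try out without a device attached. This is only a sketch: `extend_sys_path` and its parameters are names made up for this illustration and are not part of AndroidViewClient. Unlike the script, it also skips a duplicate `src` entry.

```python
import os
import sys


def extend_sys_path(environ, path=None, sep=os.pathsep):
    """Append PYTHONPATH entries and <ANDROID_VIEW_CLIENT_HOME>/src to a path list.

    Hypothetical helper for illustration; not part of AndroidViewClient.
    """
    if path is None:
        path = sys.path
    # PyDev exports PYTHONPATH; if it is missing there is nothing to split.
    for entry in environ.get('PYTHONPATH', '').split(sep):
        if entry and entry not in path:
            path.append(entry)
    # The script expects AndroidViewClient sources under <home>/src.
    home = environ.get('ANDROID_VIEW_CLIENT_HOME')
    if home:
        src = os.path.join(home, 'src')
        if src not in path:
            path.append(src)
    return path
```

Calling `extend_sys_path(os.environ)` before importing `ViewClient` would have the same effect as the two `try` blocks, provided the environment variables are set the way the script assumes.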

And now a brief explanation of the most important pieces of this script.

  1. The shebang invokes monkeyrunner as the interpreter. I don't have to tell you more (if you are a poor Windows user you may have to invoke monkeyrunner from the command line, I feel sad for you)
  2. Some comments and imports
  3. Read PYTHONPATH just in case you are using Eclipse and PyDev (this has been explained in a previous post)
  4. Then use the ANDROID_VIEW_CLIENT_HOME environment variable to find AndroidViewClient in your system
  5. Some constants are defined. VPS is the actual JavaScript that obtains the page source
  6. The standard way of connecting to the device or emulator in AndroidViewClient. This handles errors and timeouts automatically, solving many problems you find with bare monkeyrunner
  7. We start Browser
  8. Drag a bit to make the URL visible in case the page has scrolled
  9. Next, we find the View with ID id/url, which, you know, contains the URL
  10. We touch it to give it focus
  11. And type the JavaScript in VPS followed by ENTER
  12. By that time the alert dialog should be on screen, so we take a new dump
  13. Now we find the View with ID id/message, which contains the HTML, and print it
  14. Finally, we press BACK to dismiss the dialog
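
The `replace('\\n', "\n")` call in the print step can be tried on its own. The sketch below assumes, as that call suggests, that the dialog text arrives with line breaks written as a literal backslash followed by `n`. The helper name `unescape_newlines` and the sample string are made up for this illustration.

```python
def unescape_newlines(text):
    """Turn each two-character sequence backslash + 'n' into a real newline."""
    return text.replace('\\n', '\n')


# Hypothetical sample of how the alert text might look in a dump.
dumped = '<head>\\n<title>Example</title>\\n</head>'
html = unescape_newlines(dumped)
```

After this, `html` spans three lines instead of one, which is what makes the printed page source readable.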

I hope you have enjoyed it as much as I did and that this helps you find new ways of using AndroidViewClient.

P.S. This script will be part of the examples in the AndroidViewClient source code distribution.

6 comments:

keenos said...

This comment has been removed by the author.

Dami Fernández said...

Hi Diego. First of all, I'd like to thank you for your posts, they're quite useful to me.
My question is: why do I get the ViewNotFound exception for id/url?
I tried this code on an emulator with Android 2.3.3. I also tried clicking the "search button" before looking for id/url but it didn't work either. Could you help me? Thanks anyway.

Diego Torres Milano said...

@Dami Fernández,
The latest AndroidViewClient version (2.3.7) adds more support for API 10.
browser-view-page-source.py was also modified to cope with some differences and problems found in API 10 (for example 'drag' constantly fails).

When running this script I had much better luck using an emulator with Android 2.3.3 Google APIs than with Android 2.3.3 API 10.

Dami Fernández said...

Thanks, I'll do it. Anyway, in every example I run, I always get the same warning related to subprocess import:

C:\AndroidViewClient-master\AndroidViewClient\src\com\dtmilano\android\viewclient.py:23: RuntimeWarning: Unable to determine _shell_command for underlying os: nt
  import subprocess

I know where this library is, since I have it linked in PYTHONPATH (which I think is only for PyDev) as well as in the 'path' and 'lib' environment variables. Do you have any clue about this problem? Thanks

PS: I'm on Windows 7 and my python version is 3.3

Hi Guys said...

@Diego, for this browser HTML page source script I am getting the following error. Can you suggest a workaround?

130214 18:39:32.578:S [MainThread] [com.android.monkeyrunner.MonkeyRunnerOptions] Script terminated due to an exception
130214 18:39:32.578:S [MainThread] [com.android.monkeyrunner.MonkeyRunnerOptions]Traceback (most recent call last):
  File "C:\Users\koro\Downloads\AndroidViewClient-master\AndroidViewClient-master\AndroidViewClient\examples\browser-view-page-source.py", line 49, in <module>
    url = vc.findViewByIdOrRaise('id/url')
  File "C:\Users\koro\Downloads\AndroidViewClient-master\AndroidViewClient-master\AndroidViewClient\src\com\dtmilano\android\viewclient.py", line 1560, in findViewByIdOrRaise
    raise ViewNotFoundException("Couldn't find view with ID=%s in tree with root=%s" % (viewId, root))
com.dtmilano.android.viewclient.ViewNotFoundException: Couldn't find view with ID=id/url in tree with root=ROOT

Diego Torres Milano said...

@Hi Guys,
Be sure that you are using the latest (and maintained) version of the script available in AndroidViewClient examples: https://github.com/dtmilano/AndroidViewClient/blob/master/AndroidViewClient/examples/browser-view-page-source.py