apkutils package

Submodules

apkutils.apkfile module

from https://github.com/python/cpython/tree/3.6/Lib/zipfile.py

Read and write APK files.

XXX references to utf-8 need further investigation.

exception apkutils.apkfile.BadZipFile[source]

Bases: Exception

apkutils.apkfile.BadZipfile

alias of apkutils.apkfile.BadZipFile

exception apkutils.apkfile.LargeZipFile[source]

Bases: Exception

Raised when writing a zipfile, the zipfile requires ZIP64 extensions and those extensions are disabled.

class apkutils.apkfile.PyZipFile(file, mode='r', compression=0, allowZip64=True, optimize=-1)[source]

Bases: apkutils.apkfile.ZipFile

Class to create ZIP archives with Python library files and packages.

writepy(pathname, basename='', filterfunc=None)[source]

Add all files from “pathname” to the ZIP archive.

If pathname is a package directory, search the directory and all package subdirectories recursively for all *.py and enter the modules into the archive. If pathname is a plain directory, listdir *.py and enter all modules. Else, pathname must be a Python *.py file and the module will be put into the archive. Added modules are always module.pyc. This method will compile the module.py into module.pyc if necessary. If filterfunc(pathname) is given, it is called with every argument. When it is False, the file or directory is skipped.

class apkutils.apkfile.ZipFile(file, mode='r', compression=0, allowZip64=True)[source]

Bases: object

Class with methods to open, read, write, close, list zip files.

z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=True)

file: Either the path to the file, or a file-like object. If it is a path, the file will be opened and closed by ZipFile.

mode: The mode can be either read ‘r’, write ‘w’, exclusive create ‘x’, or append ‘a’.

compression: ZIP_STORED (no compression), ZIP_DEFLATED (requires zlib), ZIP_BZIP2 (requires bz2) or ZIP_LZMA (requires lzma).

allowZip64: If True, ZipFile will create files with ZIP64 extensions when needed; otherwise it will raise an exception when this would be necessary.
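The constructor arguments above can be illustrated with a short sketch. Because this module is derived from CPython's zipfile, the example uses the standard library `zipfile.ZipFile` as a stand-in; that `apkutils.apkfile.ZipFile` behaves identically is an assumption, and the member name `assets/hello.txt` is made up for illustration.

```python
import io
import zipfile  # stand-in: apkutils.apkfile is assumed to mirror this interface

# A file-like object can be passed instead of a path.
buffer = io.BytesIO()

# Mode 'w' creates a new archive; ZIP_DEFLATED requires zlib.
with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("assets/hello.txt", "hello " * 100)

# Reopen in the default read mode and inspect the stored member.
with zipfile.ZipFile(buffer, mode="r") as archive:
    names = archive.namelist()
    info = archive.getinfo("assets/hello.txt")
    original_size = info.file_size
    stored_size = info.compress_size
```

With a repetitive payload like this one, the compressed size should come out smaller than the original size.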

close()[source]

Close the file, and for mode ‘w’, ‘x’ and ‘a’ write the ending records.

property comment

The comment text associated with the ZIP file.

extract(member, path=None, pwd=None)[source]

Extract a member from the archive to the current working directory, using its full name. Its file information is extracted as accurately as possible. ‘member’ may be a filename or a ZipInfo object. You can specify a different directory using ‘path’.

extractall(path=None, members=None, pwd=None)[source]

Extract all members from the archive to the current working directory. ‘path’ specifies a different directory to extract to. ‘members’ is optional and must be a subset of the list returned by namelist().
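As a rough illustration of the path and members arguments, the following sketch extracts a subset of an archive into a temporary directory. It uses the standard library `zipfile` module as a stand-in, on the assumption that this module keeps the same behaviour; the member names are hypothetical.

```python
import io
import os
import tempfile
import zipfile  # stand-in: apkutils.apkfile is assumed to mirror this interface

# Build a small two-member archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as archive:
    archive.writestr("a.txt", b"first")
    archive.writestr("b.txt", b"second")

with tempfile.TemporaryDirectory() as target_dir:
    with zipfile.ZipFile(buffer) as archive:
        # Extract only a subset of namelist() into target_dir.
        archive.extractall(path=target_dir, members=["a.txt"])
    extracted = sorted(os.listdir(target_dir))
```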

fp = None
getinfo(name)[source]

Return the instance of ZipInfo given ‘name’.

infolist()[source]

Return a list of class ZipInfo instances for files in the archive.

namelist()[source]

Return a list of file names in the archive.

open(name, mode='r', pwd=None)[source]

Return file-like object for ‘name’.

printdir(file=None)[source]

Print a table of contents for the zip file.

read(name, pwd=None)[source]

Return file bytes (as a string) for name.

setpassword(pwd)[source]

Set default password for encrypted files.

testzip()[source]

Read all the files and check the CRC.

write(filename, arcname=None, compress_type=None)[source]

Put the bytes from filename into the archive under the name arcname.

writestr(zinfo_or_arcname, data, compress_type=None)[source]

Write a file into the archive. The content is ‘data’, which may be either a ‘str’ or a ‘bytes’ instance; if it is a ‘str’, it is encoded as UTF-8 first. ‘zinfo_or_arcname’ is either a ZipInfo instance or the name of the file in the archive.
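The two accepted forms of the first argument, and the UTF-8 encoding of str data, might look like this in practice. The sketch relies on the standard library `zipfile` module as a stand-in (an assumption about this module's compatibility), and the member names are invented for the example.

```python
import io
import zipfile  # stand-in: apkutils.apkfile is assumed to mirror this interface

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as archive:
    # A str payload is encoded as UTF-8 before it is stored.
    archive.writestr("text.txt", "héllo")
    # A ZipInfo instance may be used instead of a plain archive name.
    info = zipfile.ZipInfo("raw.bin", date_time=(2020, 1, 1, 0, 0, 0))
    archive.writestr(info, b"\x00\x01")

with zipfile.ZipFile(buffer) as archive:
    text_bytes = archive.read("text.txt")  # bytes, not str
    raw_bytes = archive.read("raw.bin")
```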

class apkutils.apkfile.ZipInfo(filename='NoName', date_time=(1980, 1, 1, 0, 0, 0))[source]

Bases: object

Class with attributes describing each file in the ZIP archive.

CRC
FileHeader(zip64=None)[source]

Return the per-file header as a string.

comment
compress_size
compress_type
create_system
create_version
date_time
external_attr
extra
extract_version
file_size
filename
flag_bits
header_offset
internal_attr
orig_filename
reserved
volume
apkutils.apkfile.error

alias of apkutils.apkfile.BadZipFile

apkutils.apkfile.is_zipfile(filename)[source]

Quickly see if a file is a ZIP file by checking the magic number.

The filename argument may be a file or file-like object too.
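A minimal sketch of the check, passing file-like objects rather than paths. It uses the standard library `zipfile.is_zipfile` as a stand-in, assuming the function in this module behaves the same way.

```python
import io
import zipfile  # stand-in: apkutils.apkfile is assumed to mirror this interface

# A real (tiny) archive held in memory.
archive_buffer = io.BytesIO()
with zipfile.ZipFile(archive_buffer, mode="w") as archive:
    archive.writestr("AndroidManifest.xml", b"placeholder")

# Arbitrary bytes that do not contain the ZIP end-of-central-directory record.
text_buffer = io.BytesIO(b"this is plain text, not an archive " * 2)

looks_like_zip = zipfile.is_zipfile(archive_buffer)
looks_like_text = zipfile.is_zipfile(text_buffer)
```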

apkutils.cert module

class apkutils.cert.Certificate(buff, digestalgo='md5')[source]

Bases: object

get()[source]

apkutils.cli module

Console script for apkutils.

apkutils.gdiff module

Diff Match and Patch Copyright 2018 The diff-match-patch Authors. https://github.com/google/diff-match-patch

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class apkutils.gdiff.diff_match_patch[source]

Bases: object

Class containing the diff, match and patch methods.

Also contains the behaviour settings.

BLANKLINEEND = re.compile('\\n\\r?\\n$')
BLANKLINESTART = re.compile('^\\r?\\n\\r?\\n')
DIFF_DELETE = -1
DIFF_EQUAL = 0
DIFF_INSERT = 1
diff_bisect(text1, text2, deadline)[source]

Find the ‘middle snake’ of a diff, split the problem in two and return the recursively constructed diff. See Myers 1986 paper: An O(ND) Difference Algorithm and Its Variations.

Args:

text1: Old string to be diffed.

text2: New string to be diffed.

deadline: Time at which to bail if not yet complete.

Returns:

Array of diff tuples.

diff_bisectSplit(text1, text2, x, y, deadline)[source]

Given the location of the ‘middle snake’, split the diff in two parts and recurse.

Args:

text1: Old string to be diffed.

text2: New string to be diffed.

x: Index of split point in text1.

y: Index of split point in text2.

deadline: Time at which to bail if not yet complete.

Returns:

Array of diff tuples.

diff_charsToLines(diffs, lineArray)[source]

Rehydrate the text in a diff from a string of line hashes to real lines of text.

Args:

diffs: Array of diff tuples.

lineArray: Array of unique strings.

diff_cleanupEfficiency(diffs)[source]

Reduce the number of edits by eliminating operationally trivial equalities.

Args:

diffs: Array of diff tuples.

diff_cleanupMerge(diffs)[source]

Reorder and merge like edit sections. Merge equalities. Any edit section can move as long as it doesn’t cross an equality.

Args:

diffs: Array of diff tuples.

diff_cleanupSemantic(diffs)[source]

Reduce the number of edits by eliminating semantically trivial equalities.

Args:

diffs: Array of diff tuples.

diff_cleanupSemanticLossless(diffs)[source]

Look for single edits surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary. e.g: The c<ins>at c</ins>ame. -> The <ins>cat </ins>came.

Args:

diffs: Array of diff tuples.

diff_commonOverlap(text1, text2)[source]

Determine if the suffix of one string is the prefix of another.

Args:

text1: First string.

text2: Second string.

Returns:

The number of characters common to the end of the first string and the start of the second string.

diff_commonPrefix(text1, text2)[source]

Determine the common prefix of two strings.

Args:

text1: First string.

text2: Second string.

Returns:

The number of characters common to the start of each string.

diff_commonSuffix(text1, text2)[source]

Determine the common suffix of two strings.

Args:

text1: First string.

text2: Second string.

Returns:

The number of characters common to the end of each string.
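The prefix and suffix helpers can be sketched with a plain linear scan. This is an illustrative reimplementation, not the library's own code (which may use a faster search), and the function names `common_prefix_length` and `common_suffix_length` are hypothetical.

```python
def common_prefix_length(text1, text2):
    """Count the characters shared at the start of both strings."""
    length = 0
    for char1, char2 in zip(text1, text2):
        if char1 != char2:
            break
        length += 1
    return length


def common_suffix_length(text1, text2):
    """Count the characters shared at the end of both strings."""
    # A common suffix is a common prefix of the reversed strings.
    return common_prefix_length(text1[::-1], text2[::-1])


prefix = common_prefix_length("1234abcdef", "1234xyz")
suffix = common_suffix_length("abcdef1234", "xyz1234")
```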

diff_compute(text1, text2, checklines, deadline)[source]

Find the differences between two texts. Assumes that the texts do not have any common prefix or suffix.

Args:

text1: Old string to be diffed.

text2: New string to be diffed.

checklines: Speedup flag. If false, then don’t run a line-level diff first to identify the changed areas. If true, then run a faster, slightly less optimal diff.

deadline: Time when the diff should be complete by.

Returns:

Array of changes.

diff_fromDelta(text1, delta)[source]

Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.

Args:

text1: Source string for the diff.

delta: Delta text.

Returns:

Array of diff tuples.

Raises:

ValueError: If invalid input.

diff_halfMatch(text1, text2)[source]

Do the two texts share a substring which is at least half the length of the longer text? This speedup can produce non-minimal diffs.

Args:

text1: First string.

text2: Second string.

Returns:

Five element Array, containing the prefix of text1, the suffix of text1, the prefix of text2, the suffix of text2 and the common middle. Or None if there was no match.

diff_levenshtein(diffs)[source]

Compute the Levenshtein distance; the number of inserted, deleted or substituted characters.

Args:

diffs: Array of diff tuples.

Returns:

Number of changes.
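One way to derive this distance from a list of diff tuples is sketched below. It assumes the `(operation, text)` tuple layout and the DIFF_DELETE, DIFF_EQUAL and DIFF_INSERT values listed for this class; the function name `levenshtein_from_diffs` is hypothetical.

```python
DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1


def levenshtein_from_diffs(diffs):
    """Sum the edit cost of each run of insertions and deletions."""
    distance = 0
    insertions = 0
    deletions = 0
    for op, text in diffs:
        if op == DIFF_INSERT:
            insertions += len(text)
        elif op == DIFF_DELETE:
            deletions += len(text)
        else:
            # An equality closes the current run: overlapping insertions
            # and deletions count as substitutions, hence the max().
            distance += max(insertions, deletions)
            insertions = deletions = 0
    return distance + max(insertions, deletions)


adjacent = levenshtein_from_diffs(
    [(DIFF_DELETE, "abc"), (DIFF_INSERT, "1234"), (DIFF_EQUAL, "xyz")]
)
separated = levenshtein_from_diffs(
    [(DIFF_DELETE, "abc"), (DIFF_EQUAL, "xyz"), (DIFF_INSERT, "1234")]
)
```

When the deletion and insertion are adjacent they are treated as substitutions, so the cost is the larger of the two lengths; when an equality separates them, both lengths are added.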

diff_lineMode(text1, text2, deadline)[source]

Do a quick line-level diff on both strings, then rediff the parts for greater accuracy. This speedup can produce non-minimal diffs.

Args:

text1: Old string to be diffed.

text2: New string to be diffed.

deadline: Time when the diff should be complete by.

Returns:

Array of changes.

diff_linesToChars(text1, text2)[source]

Split two texts into an array of strings. Reduce the texts to a string of hashes where each Unicode character represents one line.

Args:

text1: First string.

text2: Second string.

Returns:

Three element tuple, containing the encoded text1, the encoded text2 and the array of unique strings. The zeroth element of the array of unique strings is intentionally blank.
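The encoding step can be sketched as follows: each distinct line is assigned an index, and the text is rewritten as a string in which one character stands for one line. This is a simplified illustration under the assumption that lines are delimited by "\n" only; the names `split_lines` and `lines_to_chars` are hypothetical.

```python
def split_lines(text):
    """Split on '\n' only, keeping the newline attached to each line."""
    pieces = text.split("\n")
    lines = [piece + "\n" for piece in pieces[:-1]]
    if pieces[-1]:
        lines.append(pieces[-1])  # trailing text without a newline
    return lines


def lines_to_chars(text1, text2):
    line_array = [""]  # index 0 is intentionally blank
    line_index = {}

    def encode(text):
        chars = []
        for line in split_lines(text):
            if line not in line_index:
                line_array.append(line)
                line_index[line] = len(line_array) - 1
            # One character per line, chosen by the line's index.
            chars.append(chr(line_index[line]))
        return "".join(chars)

    return encode(text1), encode(text2), line_array


encoded1, encoded2, unique_lines = lines_to_chars(
    "alpha\nbeta\nalpha\n", "beta\nalpha\nbeta\n"
)
```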

diff_main(text1, text2, checklines=True, deadline=None)[source]

Find the differences between two texts. Simplifies the problem by stripping any common prefix or suffix off the texts before diffing.

Args:

text1: Old string to be diffed.

text2: New string to be diffed.

checklines: Optional speedup flag. If present and false, then don’t run a line-level diff first to identify the changed areas. Defaults to true, which does a faster, slightly less optimal diff.

deadline: Optional time when the diff should be complete by. Used internally for recursive calls. Users should set DiffTimeout instead.

Returns:

Array of changes.

diff_prettyHtml(diffs)[source]

Convert a diff array into a pretty HTML report.

Args:

diffs: Array of diff tuples.

Returns:

HTML representation.

diff_text1(diffs)[source]

Compute and return the source text (all equalities and deletions).

Args:

diffs: Array of diff tuples.

Returns:

Source text.

diff_text2(diffs)[source]

Compute and return the destination text (all equalities and insertions).

Args:

diffs: Array of diff tuples.

Returns:

Destination text.
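Both text reconstructions amount to filtering the diff list by operation, as the sketch below shows. It assumes the `(operation, text)` tuple layout and the DIFF_* constants of this class; the helper names `source_text` and `destination_text` are hypothetical.

```python
DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1


def source_text(diffs):
    """Join every equality and deletion: the text before the change."""
    return "".join(text for op, text in diffs if op != DIFF_INSERT)


def destination_text(diffs):
    """Join every equality and insertion: the text after the change."""
    return "".join(text for op, text in diffs if op != DIFF_DELETE)


diffs = [
    (DIFF_EQUAL, "jump"), (DIFF_DELETE, "s"), (DIFF_INSERT, "ed"),
    (DIFF_EQUAL, " over "), (DIFF_DELETE, "the"), (DIFF_INSERT, "a"),
    (DIFF_EQUAL, " lazy"),
]
before = source_text(diffs)
after = destination_text(diffs)
```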

diff_toDelta(diffs)[source]

Crush the diff into an encoded string which describes the operations required to transform text1 into text2. E.g. =3 -2 +ing -> Keep 3 chars, delete 2 chars, insert ‘ing’. Operations are tab-separated. Inserted text is escaped using %xx notation.

Args:

diffs: Array of diff tuples.

Returns:

Delta text.
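The delta format described above can be approximated in a few lines. This is a sketch rather than the library's code: the set of characters left unescaped in inserted text is an assumption, and `to_delta` is a hypothetical name.

```python
from urllib.parse import quote

DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1


def to_delta(diffs):
    """Encode diffs as tab-separated '=n', '-n' and '+text' tokens."""
    tokens = []
    for op, text in diffs:
        if op == DIFF_INSERT:
            # Inserted text is percent-escaped; the safe set is an assumption.
            tokens.append("+" + quote(text, safe="!~*'();/?:@&=+$,# "))
        elif op == DIFF_DELETE:
            tokens.append("-%d" % len(text))
        else:
            tokens.append("=%d" % len(text))
    return "\t".join(tokens)


delta = to_delta([(DIFF_EQUAL, "jum"), (DIFF_DELETE, "ps"), (DIFF_INSERT, "ing")])
escaped = to_delta([(DIFF_INSERT, "a b%")])
```

Only insertions carry text; equalities and deletions record lengths, which is why the original text1 is needed to expand a delta back into a full diff.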

diff_xIndex(diffs, loc)[source]

Given loc, a location in text1, compute and return the equivalent location in text2. e.g. “The cat” vs “The big cat”, 1->1, 5->8

Args:

diffs: Array of diff tuples.

loc: Location within text1.

Returns:

Location within text2.
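The location mapping can be sketched by walking the diffs while tracking how many characters of each text have been consumed. This is an illustrative reimplementation under the assumption that a location falling inside a deleted span maps to the start of that deletion in text2; `x_index` is a hypothetical name.

```python
DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1


def x_index(diffs, loc):
    """Translate a position in text1 into the matching position in text2."""
    chars1 = chars2 = 0            # characters consumed so far in each text
    last_chars1 = last_chars2 = 0  # totals before the current diff
    for op, text in diffs:
        if op != DIFF_INSERT:
            chars1 += len(text)
        if op != DIFF_DELETE:
            chars2 += len(text)
        if chars1 > loc:
            if op == DIFF_DELETE:
                # The location was deleted: snap to the deletion point.
                return last_chars2
            break
        last_chars1, last_chars2 = chars1, chars2
    # Keep the same offset past the last fully consumed diff.
    return last_chars2 + (loc - last_chars1)


after_insert = x_index(
    [(DIFF_DELETE, "a"), (DIFF_INSERT, "1234"), (DIFF_EQUAL, "xyz")], 2
)
inside_delete = x_index(
    [(DIFF_EQUAL, "a"), (DIFF_DELETE, "1234"), (DIFF_EQUAL, "xyz")], 3
)
```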

match_alphabet(pattern)[source]

Initialise the alphabet for the Bitap algorithm.

Args:

pattern: The text to encode.

Returns:

Hash of character locations.
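For the Bitap algorithm, the alphabet is a mapping from each pattern character to a bitmask of the positions where it occurs. The sketch below assumes the convention that the first pattern character corresponds to the highest bit; `build_alphabet` is a hypothetical name.

```python
def build_alphabet(pattern):
    """Map each character to a bitmask of its positions in the pattern."""
    alphabet = {}
    for index, char in enumerate(pattern):
        # Position 0 sets the highest bit, the last position sets bit 0.
        bit = 1 << (len(pattern) - index - 1)
        alphabet[char] = alphabet.get(char, 0) | bit
    return alphabet


unique = build_alphabet("abc")
repeated = build_alphabet("abcaba")
```

A repeated character accumulates one bit per occurrence, so "a" in "abcaba" sets the bits for positions 0, 3 and 5.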

match_bitap(text, pattern, loc)[source]

Locate the best instance of ‘pattern’ in ‘text’ near ‘loc’ using the Bitap algorithm.

Args:

text: The text to search.

pattern: The pattern to search for.

loc: The location to search around.

Returns:

Best match index or -1.

match_main(text, pattern, loc)[source]

Locate the best instance of ‘pattern’ in ‘text’ near ‘loc’.

Args:

text: The text to search.

pattern: The pattern to search for.

loc: The location to search around.

Returns:

Best match index or -1.

patch_addContext(patch, text)[source]

Increase the context until it is unique, but don’t let the pattern expand beyond Match_MaxBits.

Args:

patch: The patch to grow.

text: Source text.

patch_addPadding(patches)[source]

Add some padding on text start and end so that edges can match something. Intended to be called only from within patch_apply.

Args:

patches: Array of Patch objects.

Returns:

The padding string added to each side.

patch_apply(patches, text)[source]

Merge a set of patches onto the text. Return a patched text, as well as a list of true/false values indicating which patches were applied.

Args:

patches: Array of Patch objects.

text: Old text.

Returns:

Two element Array, containing the new text and an array of boolean values.

patch_deepCopy(patches)[source]

Given an array of patches, return another array that is identical.

Args:

patches: Array of Patch objects.

Returns:

Array of Patch objects.

patch_fromText(textline)[source]

Parse a textual representation of patches and return a list of patch objects.

Args:

textline: Text representation of patches.

Returns:

Array of Patch objects.

Raises:

ValueError: If invalid input.

patch_make(a, b=None, c=None)[source]

Compute a list of patches to turn text1 into text2. Use diffs if provided, otherwise compute it ourselves. There are four ways to call this function, depending on what data is available to the caller:

Method 1: a = text1, b = text2

Method 2: a = diffs

Method 3 (optimal): a = text1, b = diffs

Method 4 (deprecated, use method 3): a = text1, b = text2, c = diffs

Args:

a: text1 (methods 1,3,4) or Array of diff tuples for text1 to text2 (method 2).

b: text2 (methods 1,4) or Array of diff tuples for text1 to text2 (method 3) or undefined (method 2).

c: Array of diff tuples for text1 to text2 (method 4) or undefined (methods 1,2,3).

Returns:

Array of Patch objects.

patch_splitMax(patches)[source]

Look through the patches and break up any which are longer than the maximum limit of the match algorithm. Intended to be called only from within patch_apply.

Args:

patches: Array of Patch objects.

patch_toText(patches)[source]

Take a list of patches and return a textual representation.

Args:

patches: Array of Patch objects.

Returns:

Text representation of patches.

class apkutils.gdiff.patch_obj[source]

Bases: object

Class representing one patch operation.

apkutils.intersection module

class apkutils.intersection.APK_Intersection(apks)[source]

Bases: object

common(one, two)[source]

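Intersection of manifest contents; parts that differ are represented with an asterisk (*). Note: this is only a simple match and may not give the desired result.

Args:

one (TYPE): The first manifest.

two (TYPE): The second manifest.

Returns:

TYPE: The manifest intersection.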

static gen_words(s)[source]
get_actions(mani)[source]
get_permissions(mani)[source]
intersect_apis()[source]
intersect_arsc()[source]
intersect_certs()[source]
intersect_dex_apis()[source]

Intersection of API strings.

Real strings, excluding class names and method names: the strings defined and used in feature methods.

intersect_dex_opcode(is_wildcard, is_obj)[source]

[summary]

Args:

is_wildcard (bool): Whether to use wildcard matching.

is_obj (bool): Whether the parent class is Object.

Returns:

[type]: [description]

intersect_dex_string()[source]
intersect_dex_string_refx(filters)[source]

Intersection of strings.

Real strings, excluding class names and method names: the strings defined and used in feature methods.

intersect_dex_tree()[source]
intersect_files()[source]
intersect_manifest()[source]

Intersection of manifests.

Returns:

TYPE: Intersection of manifest contents.

intersect_manifest_tag_num()[source]
intersect_manifest_text()[source]
intersect_mf()[source]
static process_mani(mani)[source]

apkutils.wildcard module

apkutils.wildcard.find_common_opcodes(s1, s2)[source]
apkutils.wildcard.find_common_patterns(s1, s2)[source]
apkutils.wildcard.gen_wildcard_str(str1, str2, min_length=0)[source]

Get the common opcode.

apkutils.wildcard.get_best_wildcard_from_list(str1, str_list, min_length=0)[source]

From the list str_list, find the wildcard string most similar to str1.

apkutils.wildcard.get_max_len(wildcards)[source]

Find the maximum length among the wildcards.

apkutils.wildcard.get_ratio(str1, str2, weight=3)[source]
apkutils.wildcard.get_wildcards(str1, str2, min_length=0)[source]

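Get the wildcard string for two strings. min_length is the minimum length of a string between two * characters and defaults to 0. A segment shorter than this length becomes *; for example, with min_length=1, a -> *.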
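The idea of a wildcard string can be sketched with the standard library's difflib: keep the segments the two strings share and replace everything else with *. This is not the module's implementation. The function name `wildcard_sketch` is hypothetical, and treating shared segments of length min_length or less as * is an assumption based on the description above.

```python
from difflib import SequenceMatcher


def wildcard_sketch(str1, str2, min_length=0):
    """Keep shared segments, replace differing or short ones with '*'."""
    matcher = SequenceMatcher(None, str1, str2, autojunk=False)
    parts = []
    pos1 = pos2 = 0
    for block in matcher.get_matching_blocks():
        # A gap in either string before this block becomes a wildcard.
        if block.a > pos1 or block.b > pos2:
            parts.append("*")
        if block.size > min_length:
            parts.append(str1[block.a:block.a + block.size])
        elif block.size > 0:
            parts.append("*")  # shared segment too short to keep (assumption)
        pos1 = block.a + block.size
        pos2 = block.b + block.size
    # Collapse runs of adjacent wildcards into a single '*'.
    collapsed = []
    for part in parts:
        if part == "*" and collapsed and collapsed[-1] == "*":
            continue
        collapsed.append(part)
    return "".join(collapsed)


pattern = wildcard_sketch("abcXdef", "abcYdef")
short_segments = wildcard_sketch("a-b", "a+b", min_length=1)
```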

apkutils.wildcard.get_wildcards_in_list(str_list, min_length=0)[source]

Get a wildcard string that matches every string in the list.

apkutils.wildcard.longest_common_subopcode(s1, s2)[source]

This works for two ordinary strings, but it breaks if they contain symbols such as *.

apkutils.wildcard.longest_common_substring(s1, s2)[source]

Module contents

Top-level package for apkutils.

class apkutils.APK(apk_path)[source]

Bases: object

get_app_icon()[source]
get_application()[source]
get_arsc()[source]
get_certs(digestalgo='md5')[source]
get_classes()[source]
get_dex_files()[source]
get_files()[source]
get_main_activities()[source]
get_main_activity()[source]
get_manifest()[source]
get_manifest_tag_numbers()[source]

Count the number of manifest tags.

get_methods(limit=10000)[source]

Get all method paths, e.g. com/a/b/mtd_name.

Returns:

TYPE: set

get_methods_refx()[source]

Get the method index, i.e. which classes and methods use each method.

Returns

The method index.

Return type

[dict]

get_mini_mani()[source]
get_opcodes()[source]
get_org_manifest()[source]
get_org_strings()[source]
static get_proto_string(return_type, param_types)[source]
get_strings()[source]
get_strings_refx()[source]

Get the string index, i.e. which classes and methods use each string.

Returns

The string index.

Return type

[dict]

get_trees(height=2, limit=5000)[source]
static pretty_print(node)[source]

Pretty-print a node.

Args:

node (TYPE): Description

static serialize_xml(org_xml)[source]