Machine Learning for Expense Note Capture: MSc Project Report

Added on  2022/11/14

Project
AI Summary
This project report details the design and implementation of a mobile application for automated expense note capture using machine learning. The application utilizes the phone's camera to scan receipts, extract relevant information such as date, total amount, currency, and receipt type, and then allows for modification and upload to a backend system. The project focuses on the core technologies, including machine learning algorithms for image processing and optical character recognition (OCR), the C# programming language, and the Xamarin or Android platform for mobile development. The report includes a literature review of relevant machine learning techniques, a technological analysis comparing different approaches, and a detailed methodology outlining the development process. The aim is to streamline expense tracking by automating the data entry process, reducing manual input, and improving efficiency. The project also discusses the use of Microsoft Azure's LUIS (Language Understanding) service and API integration for enhanced functionality and data management.
Running head: MACHINE LEARNING FOR EXPENSE NOTE CAPTURE
MACHINE LEARNING FOR EXPENSE NOTE CAPTURE
Name of student
Name of university
Author’s note:
Table of Contents
Introduction
Literature review
Technological analysis
Methodology
References
Introduction
Recent developments in deep learning have made tasks such as speech and image recognition
practical. The technology excels at recognising objects in an image and is implemented
with artificial neural networks of three or more layers, in which each layer is responsible
for extracting one or more features of the image.
Images play a significant role in human life because vision is the most important of the
human senses. As a consequence, image processing has many applications. Images are now
found almost everywhere, and advanced technologies make it easy to produce them in large
numbers. A cloud-based system for expense note capture is being designed. This thesis
discusses the use of machine learning for image processing and image recognition, and this
report provides a brief literature review of machine learning applied to image processing.
Literature review
In a world dominated by technology, information and graphics, images play an increasingly
crucial role in many areas (Sonka, Hlavac and Boyle 2014). Image processing is applied in
fields such as science, agriculture, technology and biology. Digital technologies also make
it easy for individuals to produce complex graphical images. With such large collections of
pictures, traditional image processing strategies have to adapt to a growing range of
problems and approach the flexibility of human vision (Arganda-Carreras et al. 2017).
Conventional image processing has run into several problems and pitfalls. With the
introduction of image datasets and benchmarks, the combination of image processing and
machine learning has gained considerable recognition.
Integrating machine learning into image processing can be highly effective in this field
and can lead to an improved understanding of images (Tsaftaris, Minervini and Scharr 2016).
This approach mainly uses inductive learning to derive general rules from training data.
The number of image processing algorithms that include a learning component is expected to
grow because of the need for adaptation. In practice, producing large numbers of images
means being able to process large amounts of high-dimensional data continuously. Machine
learning is commonly regarded as a data science method that lets computers use existing
data to forecast future behaviours, trends and outcomes (Van den Oord et al. 2016). With
machine learning tools, computers learn without being explicitly programmed. The
predictions from such tools can make apps and devices significantly smarter.
Around 2012, several researchers studied traffic signs, whose visual appearance varies
widely in real situations. In practice, numerous classes of signs have to be recognised
with high precision (Van der Walt et al. 2016). Neural networks have been explored
extensively, and several image processing algorithms have been deployed to make such tasks
easier for humans. However, none of these machine learning systems could handle input
images of variable size and aspect ratio as they occur in the dataset (Yi et al. 2017). The
most standard approach is to scale all images to a fixed size. This causes problems when
the aspect ratio differs considerably between the original and the target size. In
addition, it discards information in larger images or introduces artefacts when very small
images are strongly magnified (Greenspan, Van Ginneken and Summers 2016).
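The trade-off of scaling every image to a fixed size can be illustrated with a minimal sketch in plain Python. It is not taken from the cited works: the function name `resize_nearest` is hypothetical, images are assumed to be lists of rows of pixel values, and nearest-neighbour sampling stands in for whatever interpolation a real pipeline would use.

```python
def resize_nearest(image, out_h, out_w):
    """Scale a 2D grid of pixel values to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# A 4x4 image with distinct pixel values 0..15.
big = [[r * 4 + c for c in range(4)] for r in range(4)]

# Downscaling keeps one pixel in four, so information is discarded.
small = resize_nearest(big, 2, 2)

# Upscaling a tiny image repeats pixels, which shows up as blocky artefacts.
blown_up = resize_nearest([[1, 2], [3, 4]], 4, 4)
```

Here `small` retains only four of the sixteen original values, while `blown_up` contains each source pixel four times, which mirrors the two failure modes described above.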
Data analytics is now used by many companies. The analysis of scanned documents introduces
additional challenges that depend on the quality of scanning. Machine learning has long
been used for the automatic understanding of sales records, which gives access to essential
and accurate consumption statistics (Wan et al. 2014). For images acquired with a
smartphone, researchers have introduced a full tool chain that aims to provide the required
information, such as the store brand, the purchased products and the associated prices,
with the highest confidence. The tool chain uses double-check processing that combines deep
convolutional neural networks with classical image and text processing to understand the
images (Tang et al. 2014).
In mass distribution, understanding customer behaviour is crucial information that many
companies seek. This kind of information has high added value because it yields accurate
consumption statistics (Toulouse et al. 2016). These statistics are a crucial input for the
various studies that aim to develop effective sales strategies. At present, the data is
gathered manually by recruiting people, referred to as panellists, who are asked to scan
the products they purchase and fill out forms. This solution is costly and therefore cannot
be applied to large populations, which restricts its statistical value and significance.
Reading a sales receipt reliably also matters in special cases such as validating the award
of a discount coupon (Yamamoto et al. 2014). The automatic understanding of sales receipts
is nevertheless considered challenging. Retrieving information from this kind of document
is not easy, because receipts are often badly damaged before being scanned and because
their text uses non-standardised terminology. Several works have addressed receipt capture
using machine learning methods (Szegedy et al. 2017).
However, few works deal with the analysis of a sales receipt from a picture taken with a
smartphone. The common methods depend on accurate image acquisition to allow character
recognition (Bijalwan et al. 2014). Researchers have found that applying machine learning
to receipt capture involves three major procedures: object recognition, character detection
and semantic analysis, with the main focus on object detection as the core activity
(Zhang et al. 2017). A first group of traditional works depends on local feature detection,
description and matching methods that use engineered features such as HOG and SIFT. Several
researchers have proposed a solution for localising semi-structured documents such as
tickets or ID cards. Moreover, in document images that consist mostly of text, interest
point matching is strongly disturbed by repetitive character configurations
(Ledig et al. 2017).
This approach was therefore extended with a connected key point selection technique and a
specific implementation of the RANSAC algorithm designed to be robust against character
redundancy (Ding et al. 2015). However, the proposed strategy does not satisfy the
constraint of high-accuracy information extraction for noisy images. Following the
classical approach, another set of works localises the text areas within an image by
applying region detection and classification (Cao et al. 2016). These approaches, mainly
dedicated to the analysis of complex natural scenes, are considered overly complex for this
problem. Engineered features are widely used for logo identification and have proved
efficient. However, the frequent presence of characters in store logos causes some of these
approaches to fail (Madabhushi and Lee 2016).
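The idea behind RANSAC can be shown with a minimal sketch. Document localisation would typically estimate a geometric transform between matched key points; the example below instead fits a straight line, which is the simplest setting in which the sample, score and keep-the-best loop is visible. The function name `ransac_line`, the tolerance and the iteration count are illustrative assumptions, not values from the cited work.

```python
import random


def ransac_line(points, iterations=200, tolerance=0.5, seed=0):
    """Fit y = a*x + b to 2D points while ignoring outliers (basic RANSAC)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # 1. Draw the minimal sample that defines a model: two points for a line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical pair cannot be written as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # 2. Count the points that agree with this candidate model.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tolerance]
        # 3. Keep the candidate supported by the most points.
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers


# Ten points on y = 2x + 1 plus three gross outliers.
points = [(x, 2 * x + 1) for x in range(10)] + [(2, 30), (5, -20), (8, 40)]
model, inliers = ransac_line(points)
```

Because each candidate is scored only by the points that agree with it, the three outliers never influence the final line, which is the property that makes the method attractive when many key point matches are wrong.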
OCR stands for optical character recognition. The technology is used to extract data and
information from files and store them in electronic formats. Image processing software for
improved OCR results uses a simple sequence of steps to produce high-quality content and
images (Singh et al. 2016).
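Once OCR has produced raw text, the fields of interest for an expense note still have to be located in it. The following is a minimal sketch under stated assumptions: the function name `parse_receipt_text`, the regular expressions, the three supported currencies and the sample receipt are all hypothetical, and a real receipt would need far more tolerant patterns.

```python
import re

# Map currency symbols to ISO codes (illustrative subset).
SYMBOL_TO_CODE = {"€": "EUR", "$": "USD", "£": "GBP"}


def parse_receipt_text(text):
    """Pull date, total and currency out of raw OCR text (illustrative patterns only)."""
    date = re.search(r"\b(\d{2}[/-]\d{2}[/-]\d{4})\b", text)
    # The amount is the first number with two decimals after the word "total".
    total = re.search(r"\btotal\b\D*?(\d+[.,]\d{2})", text, re.IGNORECASE)
    currency = re.search(r"\b(EUR|USD|GBP)\b|([€$£])", text)
    code = None
    if currency:
        code = currency.group(1) or SYMBOL_TO_CODE[currency.group(2)]
    return {
        "date": date.group(1) if date else None,
        "total": float(total.group(1).replace(",", ".")) if total else None,
        "currency": code,
    }


# Hypothetical OCR output for a small receipt.
sample = "CAFE EXAMPLE\n14/11/2022 12:31\nCoffee 2,50\nSandwich 4,90\nTOTAL EUR 7,40\n"
fields = parse_receipt_text(sample)
```

Returning `None` for a field that was not found lets the application ask the user to fill it in rather than upload a guessed value.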
The state of the art now emphasises a second class of methods, based on deep learning, that
has been observed to outperform all previous approaches on object detection in images. An
enabling factor is the availability of huge annotated datasets for optimising the large
number of parameters (Chen et al. 2014). The most common solution is to fine-tune
pretrained networks, as proposed by previous researchers. For text block detection, machine
learning based methods show good results, for example by relying on a multi-task network
that allows both word recognition and detection in uncontrolled scenes. In other research,
the authors proposed an approach to text line localisation based on Convolutional Neural
Networks and Multidimensional Long Short-Term Memory cells. In contrast, the text here is
processed within particular regions or text blocks, as these present different semantic
content. For logo detection, most existing works address the classification and
localisation of the logo using machine learning. For example, most of the proposed methods
select candidate subwindows with an unsupervised segmentation algorithm and then apply
SVM-based classification to these candidate regions using features computed by a pretrained
Convolutional Neural Network. Researchers have also used pretrained deep networks to
perform logo recognition. The results are interesting, and because logos frequently consist
solely of characters, detection should be performed jointly with character recognition
(Yu and Zhang 2015).
Machine learning can be described as a class of algorithms that allows software
applications to become more accurate in predicting results without being explicitly
programmed. Its basic premise is to build algorithms that receive input data and use
statistical analysis to predict an output, updating the outputs as new data becomes
available. The main procedures involved in machine learning are similar to those of data
mining and predictive modelling (Hong et al. 2015). Both require searching through data for
patterns and then adjusting the program's actions according to the results. Many
organisations and developers are familiar with machine learning from shopping on the
internet and being served ads related to their purchases. This happens because
recommendation engines use machine learning to personalise online ad delivery in real time.
Beyond personalised marketing, other use cases of machine learning include fraud detection,
spam filtering, network security threat detection, building news feeds and predictive
maintenance.
Machine learning algorithms are frequently categorised as supervised or unsupervised.
Supervised algorithms need a data scientist or data analyst with machine learning skills to
provide both the input and the desired output, and to give feedback on the accuracy of the
predictions while the algorithm is trained. The data scientist determines which variables,
or features, the model should analyse and use to develop predictions
(Razzak, Naz and Zaib 2018). Unsupervised algorithms do not need to be trained with desired
outcome data. Instead, they take an iterative approach to reviewing the data and arriving
at conclusions. Neural networks and deep learning are commonly used for more complex
processing tasks, including image recognition, speech-to-text and natural language
generation.
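The difference between the two categories can be made concrete with a small sketch in plain Python. The data, labels and function names are hypothetical: a nearest-centroid classifier stands in for a supervised method because it is trained on labelled amounts, while a one-dimensional two-means clustering stands in for an unsupervised method because it receives the same amounts without labels.

```python
def train_centroids(samples, labels):
    """Supervised: average the labelled samples of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def two_means(samples, steps=10):
    """Unsupervised: split unlabelled samples into two groups (1D k-means, k=2).

    Assumes the samples contain at least two distinct values.
    """
    lo, hi = min(samples), max(samples)
    for _ in range(steps):
        low = [x for x in samples if abs(x - lo) <= abs(x - hi)]
        high = [x for x in samples if abs(x - lo) > abs(x - hi)]
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    return low, high


# Hypothetical expense amounts with the category a person assigned to each.
amounts = [3.0, 4.5, 5.0, 80.0, 95.0, 110.0]
labels = ["meal", "meal", "meal", "hotel", "hotel", "hotel"]

centroids = train_centroids(amounts, labels)  # needs the labels
guess = predict(centroids, 7.0)               # classifies a new amount
groups = two_means(amounts)                   # finds the same split without labels
```

The supervised model can name the category of a new amount because it was given names during training; the clustering finds the two groups but cannot say what they mean.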
A machine learning tool commonly used for designing image processing or bill detection
applications is Microsoft Azure Machine Learning. The Azure Machine Learning service offers
a cloud-based working environment that developers can use to prepare data and to train,
test, deploy, manage and track machine learning models. The service supports open source
technologies such as PyTorch, TensorFlow and scikit-learn, and it can be used for many
kinds of machine learning, from conventional machine learning to deep learning, supervised
or unsupervised. Microsoft Azure Machine Learning can be described as a collection of tools
and services intended to help developers train and deploy machine learning models.
Microsoft provides these services and tools through the Azure public cloud. The Microsoft
Azure Machine Learning suite includes an array of tools and services such as:
Azure Machine Learning Workbench: Workbench is an end-user Windows/macOS application that
handles the primary tasks of a machine learning project, including data import and
preparation, model development, experiment management and model deployment in multiple
environments. Workbench interoperates with major third-party tools, including Git for
version control and Jupyter Notebook for data transformation and cleaning, statistical
modelling and data visualisation.
Azure Machine Learning Experimentation Service: This service interoperates with Workbench
to provide project management, access control and version control with Git. It supports
the execution of machine learning experiments to build and train models. Experimentation
also constructs virtualised environments that let developers isolate and run models, and it
records the details of every run to aid model development. Experimentation can deploy
models locally, to a local Docker container, to a Docker container on a remote virtual
machine, or to a scale-out Spark cluster running in Azure.
Azure Machine Learning Model Management: This service helps developers track and manage
model versions, register and store models, package models and their dependencies into
Docker image files, register those images in their own Docker registry in Azure, and deploy
the container images to a wide assortment of computing environments, including IoT edge
devices.
Microsoft Machine Learning Libraries for Apache Spark: MMLSpark offers a series of tools
that integrate Spark pipelines with machine learning tools, including the Microsoft
Cognitive Toolkit and the OpenCV library. These libraries help accelerate the development
of machine learning models that involve text and image data.
Technological analysis
Several tools and techniques are used to develop the expense note capture application. The
main ones are:
Visual Studio: Visual Studio is a complete set of development tools for Windows, web and
mobile applications. Visual C++, Visual Basic, Visual F#, Visual C# and several other
languages are supported in Visual Studio. Developers can build software with it easily, and
it is user friendly. Visual Studio gives access to a set of class libraries and supports
object-oriented programming (OOP) concepts. In this application, Visual Studio is used to
develop the software that captures data from receipts.
Xamarin: When considering how applications are developed for the iOS and Android platforms,
most people assume that the native languages are the only way to build them. Xamarin allows
development in the C# language, with a class library and runtime that work across platforms
including iOS, Windows and Android, while still compiling native applications that perform
well enough even for demanding uses. Xamarin includes bindings for almost the complete
underlying platform SDKs on both Android and iOS. In addition, the bindings are strongly
typed, which makes them easy to navigate and use and provides robust compile-time type
checking during development.
Microsoft Azure: Microsoft Azure is a cloud computing service created by Microsoft to
build, test, deploy and manage applications and services through Microsoft-managed data
centres. It offers software as a service, infrastructure as a service and platform as a
service, and it supports many programming languages, frameworks and tools, including both
Microsoft-specific and third-party systems and software. Virtual machines, or
infrastructure as a service, let users launch general-purpose Microsoft Windows and Linux
virtual machines as well as preconfigured machine images for popular software packages.
Microsoft SQL Server: Microsoft SQL Server is a relational database management system that
supports a wide variety of transaction processing, business intelligence and analytics
applications in corporate IT environments. It is one of the market-leading database
technologies, alongside Oracle Database and IBM's DB2. Like other RDBMS software, Microsoft
SQL Server is built on top of SQL, the standardised programming language that database
administrators and IT professionals use to manage databases and query the data they
contain. Like other RDBMS technologies, SQL Server is built around a row-based table
structure that connects related data elements in different tables to one another, avoiding
the need to store data redundantly in several places in a database.
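The way related tables avoid redundant storage can be sketched for an expense record. The example below is an assumption-laden illustration, not the project's schema: it uses Python's built-in `sqlite3` module as a stand-in for SQL Server so that it runs without a server, and the table names, column names and sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server backend.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each merchant is stored once.
    CREATE TABLE merchant (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Each expense refers to its merchant by key instead of repeating the name.
    CREATE TABLE expense (
        id          INTEGER PRIMARY KEY,
        merchant_id INTEGER NOT NULL REFERENCES merchant(id),
        spent_on    TEXT NOT NULL,
        total       REAL NOT NULL,
        currency    TEXT NOT NULL
    );
    INSERT INTO merchant (id, name) VALUES (1, 'Cafe Example');
    INSERT INTO expense (merchant_id, spent_on, total, currency) VALUES
        (1, '2022-11-14', 7.40, 'EUR'),
        (1, '2022-11-15', 3.10, 'EUR');
""")

# A join reassembles the full record from the two tables.
rows = conn.execute("""
    SELECT m.name, e.spent_on, e.total, e.currency
    FROM expense AS e
    JOIN merchant AS m ON m.id = e.merchant_id
    ORDER BY e.spent_on
""").fetchall()
```

Renaming the merchant would then require updating a single row in `merchant`, however many expenses refer to it.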