Information Extraction - A Text Mining Approach

Added on - 17 Feb 2021

Profiling Travellers' Mode Choice towards Airport Access (HKIA): Introducing the Text Mining Approach

Introduction (Background, Motivation, Problem Identification, Expected Outcome, Significance)

Situated in the Pearl River Delta, Hong Kong is a regional logistics hub, Asia's top travel destination and an international centre. It drew more than 58 million visitors in 2018, generating nearly 300 billion in per capita spending with an average stay of 3.2 nights (Hong Kong Tourism Board, 2018). Of these, 30 million travelled to Hong Kong by air and landed at the Hong Kong International Airport (Civil Aviation Department (CAD), 2018), while 4.6 million passengers used HKIA's cross-boundary land and sea transport (Airport Authority Hong Kong, 2018). Opened on 6 July 1998, the Hong Kong International Airport (HKIA) connects to over 200 destinations worldwide through more than a hundred airlines. The Airport Authority aims to enhance its capacity as a leading aviation hub to cater for growing demand and to serve as a key engine of economic growth. With outstanding operational performance, HKIA is currently ranked 5th among the world's top 10 airports by Skytrax (Airport Authority Hong Kong, n.d.).

To accommodate the massive arrivals, the egress links to the city centre, Mainland China and Macau are well developed. Visitors can choose from a variety of transport modes, ranging from bus services and the Airport Express to taxis and on-demand transport such as Uber. The airport connects people from around the world with the aviation system and with the other modes of transport in the city. Passengers' mode choice is paramount in evaluating the efficiency of the airport transport system, and it provides valuable insights for policy making and system planning. Yet there are limited studies on visitors' preferences in their airport access mode choice to HKIA.
In previous research on airport access mode choice at airports in Turkey (Gokasar & Gunay, 2017) and Korea (Choo, You, & Lee, 2013) and at HKIA (Tam, Tam, & Lam, 2005), data were collected through surveys.

Collecting data from surveys is time consuming, from setting up questionnaires to engaging with respondents. It is also difficult to acquire respondents over a longer period (more than a week) in order to determine seasonal trends. Furthermore, the sample size is small, so the modelling results must be interpreted cautiously. Text mining, an emerging data collection technique, draws on a much larger pool of data than surveys, across a longer period and in less time. It also provides a different type of insight compared with the traditional survey and modelling approach.

Text mining, as a knowledge discovery technique, is gaining importance in this digital era. Information is readily available on media such as forums, Facebook and Twitter. The technique, as an extension of extracting logical patterns from structured databases, combines multiple fields to generate decision analytics from large data sets through information retrieval, text analysis, natural language processing and information classification (Irfan, et al., 2015). It covers disciplines in statistics, linguistics and machine learning, and generally includes the categorization of information and the clustering of text, among other tasks.

The purpose of the text mining approach is to transform unstructured textual data into meaningful numeric indices, so that the information contained in the text becomes accessible to various forms of mining.
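To make the idea of turning text into numeric indices concrete, the following minimal sketch builds a small document-term matrix using only the Python standard library. The function names, the simple letters-only tokenisation rule and the two sample reviews are assumptions made for illustration; they are not taken from the project, and a real study would more likely rely on a library such as NLTK or scikit-learn.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is the sorted set of all terms seen in any document.
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical review snippets, invented for illustration only.
reviews = [
    "The airport express was fast and clean",
    "Took the bus to the airport, the bus was cheap",
]
vocabulary, matrix = document_term_matrix(reviews)

for review, row in zip(reviews, matrix):
    print(row, "<-", review)
```

Each row of the matrix is a numeric representation of one document, which is the form that clustering or predictive methods expect as input.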
In general terms, text mining turns text into numbers or meaningful indices, which can then be used in other forms of analysis such as predictive data mining projects or unsupervised learning methods. Several approaches to text mining are outlined below.

Using well-tested methods: Once a data matrix has been derived from the input documents, well-developed and well-known analytical tools and techniques should be used to process the data further. This approach can incorporate methods such as clustering, factoring or predictive data mining.

Black box approach: A number of text mining applications use a black box method so that deep meaning can be extracted from documents with a certain amount of human effort. In this method, text mining depends mainly on proprietary algorithms for extracting concepts from text. This technology is still in its infancy.

Text mining as document search: Another application often referred to as text mining is document search within a particular domain. Popular internet search engines, for example, give individuals efficient access to web pages with relevant content. This type of application is very beneficial for business entities that have to search large directories of data, since it helps them pull out the right information within a specific time frame.

The introduction and literature review have been done; help is mostly needed with the methodology and results parts, and with the text mining part too. Changes to the scope or objectives of the project are acceptable, but the project will mostly use data from TripAdvisor.

Text mining also covers the extraction of concepts and the formulation of general taxonomies. It helps extract useful information from bulk data efficiently and in a short period of time, and it assists in predicting future developments based on the observations and statistics derived from trends in the data sets (Hashimi, Hafez, & Mathkour, 2015). Social media mining has been employed by many businesses to perform competitive analysis by transforming data into insights. In contrast with traditional data analytics, social media tools show the interactivity between users, which has come to play a crucial role in changing how people communicate. Traditional media engages people in a one-way connection.
Referrals and promotions from social word-of-mouth also cultivate companies' understanding of their customer base, which brings business value by helping them develop their marketing and business strategies (Shen, Chen, & Wang, 2018).

This project combines text mining techniques with social media to unveil travellers' preferences in their mode choice of airport access. The motivations are twofold: first, to apply data science methodology to collect and analyse social media data; second, to present past and current trends in transportation preferences and their implications, and hence provide interesting insights.

Significance

The objectives of this study are as follows:
1. To analyse the concept of text mining as a new approach to looking into mode choices and transportation
2. To identify the explanatory variables for mode choice
3. To find out travellers' experience with the transportation system to and from Hong Kong International Airport
4. To analyse the preferred mode choice
5. To determine the change of preferences over time and seasonal preference

Expected Outcome

Travellers' preferences and experience with the access modes of Hong Kong International Airport are expected to be found through parsing and analysing online data. The insights and trends are expected to yield recommendations for enhancing the current system, policy and planning of airport access modes. Most importantly, the study gives an outline of the text mining approach for finding airport access mode choice and sets the ground for a wider scope of study in the future.

Literature Review

Airport access mode choice

To facilitate the advancement of airport management, understanding air passengers' views of airport access modes is of crucial importance. Alhussein (2011) carried out the very first research on ground access mode choice to King Khaled International Airport (KKIA) in Riyadh, Saudi Arabia, aiming to analyse access mode behaviour to KKIA.
Tam, Tam and Lam (2005) examined the access mode choices of departing passengers to provide source information for transport operators to improve their service planning and increase their share of the airport ground access market. Choo, You and Lee (2013) explored passengers' airport access mode choice and developed mode choice models after conducting Chi-square and ANOVA tests to identify the key explanatory variables for the airports. All of these studies have one thing in common: data were collected through surveys or face-to-face interviews at the terminals, targeting departing passengers at random.

The research by Tam, Tam and Lam (2005) included not only the structural relations between passengers' personal characteristics and trip characteristics, but also Expectation and Perception, two latent variables that previous research had not taken into account. Personal and trip characteristics including gender, age, education level, flight length and travel cost all negatively affect the use of public transport modes for airport ground access, as also suggested by Alhussein (2011). Public transport dominates as the top mode choice in Hong Kong, in contrast to western countries. Visitors who are on business trips or who visit HKIA less frequently tend to select private cars or taxis as their ground access mode. The results indicated that respondents' perceived levels of satisfaction were lower than their expectations on the five selected service attributes (franchised bus, AEL, private car, taxi, others). Passengers found travel time reliability the most satisfactory service attribute, while the waiting time for franchised buses, the walking distance to and from the Airport Express stations, the travel cost of taxis and private cars, and the waiting time for airport shuttle buses offered by hotels and travel agencies all have a high priority for improvement. Alhussein (2011) and Tam, Tam and Lam (2005) suggested that future studies could collect data to determine the effects of travel seasons on airport ground access mode choice, with the inclusion of more service attributes and the latent variables.

Text mining and social media

Social media such as online forums have gained increasing popularity for exchanging ideas and advice. Discovering knowledge from online communities could be rewarding.
Park, Conway and Chen (2017) employed text mining, qualitative analysis and visualization to compare online discussion content from three online mental health communities. The corpus was downloaded using the Python Reddit API Wrapper (PRAW). The Python Natural Language Toolkit and Scikit-learn were then used to pre-process the dataset: tokenization and the removal of stop words, punctuation, and both high- and low-frequency terms. K-means clustering followed to identify the main discussion themes in a large collection of documents. The frequency of term appearance was then visualized as a bubble chart, proportional to the cluster size, using D3, and as a network visualization using Gephi. A Venn diagram was used to visualize the thematic overlaps among the three online communities, which allowed a qualitative comparison. The Louvain modularity algorithm (in Gephi) and a heatmap visualization of Jaccard similarity scores were used to illustrate how clusters are topically similar to and dissimilar from one another. The research findings facilitate more nuanced discussions and encourage future research to include multiple methods to fully understand the differences among conditions with shared symptomatology. The approach also serves as a valuable takeaway for analysing and visualizing textual comparisons.

Social media is a modern approach by which companies can enhance their popularity among a large number of people at very high speed. It can be considered highly effective because it reaches a large number of people. Nowadays people of every generation have accounts on social media. This assures a company that any information it shares on the platform will reach everyone, from young to old. As a result, the shared information can bring business to the company and contribute to attaining its targets.
It can also increase the company's profitability effectively and in less time. Social media sites are also helpful for gathering suggestions from customers, as users can share their personal experiences. On the basis of that experience, they also offer advice on the company's official account, which helps the company address its weaknesses. By working on these areas, the company can enhance the quality it provides to customers. In the context of text mining, raw data from these platforms can be converted into meaningful data. Social media can then also be used to share the information collected with a large audience.

Methodology

Text-mining
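As a rough illustration of two of the steps reviewed in the literature above (pre-processing by tokenisation and stop-word removal, then comparing texts with Jaccard similarity), a minimal Python sketch follows. It is not the method of Park, Conway and Chen (2017), nor necessarily the method of this project: the tiny stop-word list, the function names and the sample reviews are all assumptions made for illustration, and only the standard library is used in place of NLTK or Scikit-learn.

```python
import re

# A tiny illustrative stop-word list (an assumption); NLTK ships a fuller one.
STOP_WORDS = {"the", "a", "an", "to", "was", "and", "of", "is", "it", "in", "from"}


def preprocess(text):
    """Lowercase, drop punctuation, tokenise, and remove stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two term sets (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical reviews, invented for illustration only.
bus_review = preprocess("The bus to the airport was cheap.")
train_review = preprocess("The train to the airport was fast!")
taxi_review = preprocess("Expensive taxi, long queue.")

print(jaccard(bus_review, train_review))  # the two reviews share the term "airport"
print(jaccard(bus_review, taxi_review))   # the two reviews share no terms
```

A score closer to 1 means two texts use more of the same terms, so a table of pairwise scores can be drawn as the kind of heatmap mentioned in the literature review.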