Introduction to Simple Linear Regression: Article Review

Abstract

Linear regression is used to predict a trend in data, or to predict the value of one (dependent) variable from the value of another (independent) variable, by fitting a straight line through the data. Dallal (2000) examined how significant the linear regression equation is, how to use it to draw the best-fitting line through a scatter plot, and how important the best-fitting line is.

Introduction to simple linear regression: Article review

Linear regression is used to predict a trend in data, or to predict the value of one (dependent) variable from the value of another (independent) variable, by fitting a straight line through the data. Linear regression represents the link between the independent (carrier) variable and the dependent (response) variable, which, if graphed on X and Y coordinates, results in a straight line. It identifies the straight line that best represents, or predicts, the value of the response variable given the observed value of the carrier variable (Frey, 2006). This essay reviews the article “Introduction to simple linear regression” by Dallal (2000).

Problem statement

Dallal (2000) assumed a relationship between body mass (the independent or carrier variable) and muscle strength (the dependent or response variable): the greater the body mass, the greater the muscle strength. However, this relationship is not without exceptions, which show up in the scatter plot of a regression model. Therefore, the author posed the question of how to draw the straight line that most accurately portrays the data, or predicts the value of the response variable.

Research purpose statement

In the given example, most cases would show a perfect regression. However, standardization of the procedure for fitting a straight line is necessary to provide better communication and common ground for analysts working on the same data. Further, from the example regression equation given (Strength = -13.971 + 3.016 LBM, where LBM is lean body mass), one can draw two conclusions. First, predicted muscle strength equals LBM multiplied by 3.016, minus 13.971. Second, the predicted difference in muscle strength between two individuals is 3.016 multiplied by the difference in their LBM.
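
The two conclusions can be checked with a few lines of Python. This is a minimal sketch: the coefficients are the ones quoted above, while the function name and the two LBM values are hypothetical and chosen only for illustration.

```python
# Coefficients of the example equation quoted in the review.
INTERCEPT = -13.971
SLOPE = 3.016


def predict_strength(lbm):
    """Predicted muscle strength for a given lean body mass (LBM)."""
    return INTERCEPT + SLOPE * lbm


# Hypothetical LBM values for two individuals (illustration only).
lbm_a, lbm_b = 50.0, 60.0

strength_a = predict_strength(lbm_a)
strength_b = predict_strength(lbm_b)

# The predicted difference depends only on the slope and the LBM difference.
difference = strength_b - strength_a
print(strength_a, strength_b, difference)
```

Whatever two LBM values are chosen, the intercept cancels in the difference, which is why the second conclusion involves only the slope.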

Research questions

Research question 1: Why do we need to fit a regression equation to a set of data?

It is clear from the previous example that there are reasons for fitting a regression equation to a set of data. These are 1) to describe the data, and 2) to predict a dependent (response) variable from an independent (carrier) one.

Research question 2: What is the underlying principle of calculating a straight line?

If the points representing data in a scatter plot are close to a line, the line represents, matches or gives a good fit to the data. If not, then the line with most of the points closer to it than to any other is the one that gives a good fit. Further, if the line is used to predict values, these values should be close enough to the observed ones; in other words, the residuals (observed values minus predicted values) should be small.

Research question 3: How is the linear regression (least squares) equation used to illustrate the best-fitting line?

The criterion used, as the name implies, is that the sum of squared residuals (observed minus predicted values) is minimal for the best-fitting line. This applies to a line fitted to a set of sample data in order to generalize to the population from which the sample was taken. For the population, there is a slightly different linear regression equation: it states that an output (dependent) variable on the Y-axis can be predicted from an input (independent) variable on the X-axis after adding a random error (εi).
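
The least squares criterion can be sketched in plain Python. The closed-form slope and intercept below are the standard textbook formulas; the data points and function names are hypothetical and are not taken from Dallal's article.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of (observed - predicted)^2 for a candidate line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data (illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)

# Any other line, here one with a slightly steeper slope, fits worse.
other = sum_squared_residuals(xs, ys, b0, b1 + 0.1)
print(b0, b1, best, other)
```

Comparing `best` with `other` shows the defining property: changing the fitted slope or intercept can only increase the sum of squared residuals.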

Research question 4: Is the sample regression equation an accurate estimate of the population regression equation?

There is a reservation about accepting this statement, which concerns the confidence bands around the regression line. They are interpreted like the standard error of the mean (the standard deviation of the sampling distribution of the mean), with one exception: the uncertainty in the estimated mean of the dependent variable grows as the value of the independent variable moves away from its mean.
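
The widening of the bands can be illustrated with the usual textbook formula for the standard error of the fitted mean at a point x0, s * sqrt(1/n + (x0 - mean_x)^2 / Sxx). The sketch below assumes that formula and uses hypothetical data; it is not taken from Dallal's article.

```python
import math


def se_of_fitted_mean(xs, ys, x0):
    """Standard error of the fitted mean response at x0 (simple regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ssr / (n - 2))  # residual standard deviation
    return s * math.sqrt(1.0 / n + (x0 - mean_x) ** 2 / sxx)


# Hypothetical sample data (illustration only); the mean of xs is 3.0.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

se_at_mean = se_of_fitted_mean(xs, ys, 3.0)
se_far = se_of_fitted_mean(xs, ys, 10.0)
print(se_at_mean, se_far)
```

At the mean of the independent variable the second term vanishes and the value reduces to s / sqrt(n), the familiar standard error of a mean; further away the band is wider.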

Sources of data

The data Dallal (2000) describes in the second part of his article (linked to the main article) are cross-sectional. This type of data has the advantage that it can be used when the sampling method is unweighted and/or unstratified. It can also be used if the researcher is concerned only with minor or small probabilities. Longitudinal data give more statistical power; however, in repeated cross-sectional analysis, the new subjects added at each analysis compensate for the inherently lower statistical power (Yee and Niemeier, 1996).

Data collection strategies and methods

A good data collection strategy should have two objectives. The first is having motivated respondents (motivation is affected by the time required, trust in statistics, the difficulty of the questionnaire, and the benefit offered). The second is having high-quality data, which calls for a strategy tailored to the sampled individuals, a sound sampling method, and good data collection instruments (Statistics Norway, 2007).

Methods of data collection are many, and the selection of a particular method depends on the available resources, reliability, the resources for analysis and reporting, and the skills and knowledge of the analyst. Some of these methods are case studies, behavior observation checklists, attitude and opinion surveys, and questionnaires distributed by mail, e-mail, or phone. Other methods include time series (evaluating one variable over a period of time, such as a week) and individual or group interviews (The Ohio State University Bulletin Extension, 2005).

Conclusions

Dallal (2000) inferred that simple linear regression lets us predict a dependent variable from an independent one, so that we need not start from the beginning each time we add information. The regression line is important because it makes the estimate of a dependent variable more accurate, and it allows the response variable to be estimated for individuals whose values of the carrier variable are not included in the data. The author also noted that there are two ways of predicting a variable: from within the range of values of the independent variable in the given sample (interpolation) or from outside this range (extrapolation). The author recommended the first method as the safer one, though concerns remain about how to demonstrate the linearity of the relationship between the two variables.
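
The distinction between interpolation and extrapolation amounts to a simple range check. The helper below is a hypothetical sketch; the observed values are invented for illustration.

```python
def prediction_kind(x_new, observed_xs):
    """Classify a prediction at x_new relative to the observed range."""
    lowest, highest = min(observed_xs), max(observed_xs)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"


# Hypothetical observed values of the carrier variable (illustration only).
observed = [35.0, 42.5, 50.0, 58.0, 64.5]

inside = prediction_kind(47.0, observed)   # within the observed range
outside = prediction_kind(80.0, observed)  # beyond the largest observation
print(inside, outside)
```

A check like this only flags where a prediction falls; it says nothing about whether the relationship is actually linear inside the observed range.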

References

Dallal, G. (2000). Introduction to simple linear regression. Retrieved January 14, 2008, from http://www.tufts.edu/~gdallal/slr.htm.

Frey, B. (2006). Statistics Hacks. Sebastopol, CA: O’Reilly Media Inc.

Statistics Norway (2007). Strategy for data collection. Retrieved 04/07/2008, from http://www.ssb.no/vis/english/about_ssb/strategy/strategy_data_collection.pdf

The Ohio State University (2005). Bulletin Extension – Step Four: Methods of Data Collection. Retrieved 04/07/2008, from http://www.ohioline.ag.ohio-state.edu

Yee, J. L., & Niemeier, D. (1996). Advantages and Disadvantages: Longitudinal vs. Repeated Cross-Section Survey-A Discussion Paper. Project Battelle, 94, 16-22.

Japanese traditional game

Introduction

Given the task of innovating on a Japanese traditional game, we decided to take Two – Ten Jack and create our own game, which is much simpler to play. It uses part of a deck of Uno cards and a board with numbers on which bets are placed. To win prizes, a player must place a bet that matches either the number or the color of a card drawn from the deck in play. Two – Ten Jack is played without a dealer, with points added and deducted, and in the end the player with the highest point balance wins. The manuals for Two – Ten Jack and for our game follow, and a comparison then shows how our game innovates on the original.

The Game Manual for the Two – Ten Jack
Preliminaries

The object of two-ten-jack is to get the most points by taking tricks containing positive point cards while avoiding tricks containing negative point cards.

Two players receive six cards each from a standard 52-card deck ranking A K Q J 10 9 8 7 6 5 4 3 2, and the remaining undealt cards are placed between the players to form the stock. Non-dealer leads the first trick and the winner of each trick leads to the next. Players replenish their hands between tricks by each drawing a card from the stock, with the winner of the last trick drawing first. Play continues until all of the cards in the entire deck have been played. Points are then tallied before the deck is reshuffled and dealt anew.

Following, Trumping, and Speculation

In two-ten-jack a player may lead any card and the other player must play a card of the same suit if able, or otherwise must play a trump card if able. If a player has neither cards in the lead suit nor trumps, then any other card may be played. The highest trump card, or the highest card of the lead suit if no trumps were played, takes the trick.

In two-ten-jack hearts are always the trump suit and the ace of spades is a special trump card known as speculation, ranking above all of the hearts. Rules for playing speculation are as follows:

If a trump (heart) is led, a player may follow with speculation and must play speculation if no other trumps are held in the hand.
If a spade is led, a player may follow with speculation and must likewise play speculation if no other spades are held in the hand.
If a club or diamond is led and the other player has neither of these, speculation may be played, and must be played if no other trumps are available.
A player leading speculation must declare it as either a spade or trump.
Scoring and winning

Cards are worth the following point values:

2♥, 10♥ and J♥ are worth +5 each
2♣, 10♣ and J♣ are worth -5 each
2♠, 10♠, J♠ and A♠ are worth +1 each
6♦ is worth +1 point
Hence the total number of card points per deal is +5. Winner is the first player to reach 31 points.
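
The total of +5 follows directly from the four scoring lines above. A minimal Python check, using only the number of cards and the value per card on each line:

```python
# (number of scoring cards, points per card) for each of the four scoring lines.
scoring_groups = [
    (3, 5),   # three cards worth +5 each
    (3, -5),  # three cards worth -5 each
    (4, 1),   # four cards worth +1 each
    (1, 1),   # one card worth +1
]

total_points = sum(count * value for count, value in scoring_groups)
print(total_points)  # 5
```

The +5 and -5 groups cancel, so the five cards worth +1 are what make the deal total positive.
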
Game Manual
The game requires one to two players, with a maximum of five players in each round.
Start by placing a single bet.
Each bet is placed on a number from zero to nine and on one of four colors.
Each round, six cards are drawn from the deck.
Bets are counted in sweets.
Each sweet costs RM1.
Each player starts with one sweet.
A bet matching the color of one of the six cards drawn gets the player's money back.
A bet matching the number of one of the six cards drawn wins 5 sweets.
A bet matching neither the color nor the number of any of the six cards loses 1 sweet.
A bet matching both the color and the number of a card walks away with RM50.
A bet matching both the color and the number of a card, when another of the six cards drawn has the same number in a different color, walks away with RM100.
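
The payout rules above can be sketched as a small Python function. Several points are assumptions rather than part of the manual: the order in which the rules are checked (largest prize first), the reading of the color-match rule as the stake being returned, and the representation of cards and bets as (number, color) pairs.

```python
def settle_bet(bet, drawn):
    """Return the outcome of one bet against the six drawn cards.

    Both the bet and each drawn card are (number, color) pairs.
    The precedence of the rules is an assumption made for this sketch.
    """
    bet_number, bet_color = bet
    exact = any(card == bet for card in drawn)
    same_number_other_color = any(
        number == bet_number and color != bet_color for number, color in drawn
    )
    same_color = any(color == bet_color for _, color in drawn)

    if exact and same_number_other_color:
        return "RM100"
    if exact:
        return "RM50"
    if same_number_other_color:
        return "win 5 sweets"
    if same_color:
        return "stake returned"
    return "lose 1 sweet"


# Hypothetical draw of six cards (illustration only).
drawn = [(3, "red"), (3, "blue"), (7, "green"),
         (1, "red"), (9, "red"), (0, "blue")]

print(settle_bet((3, "red"), drawn))     # exact match plus another 3
print(settle_bet((7, "green"), drawn))   # exact match only
print(settle_bet((7, "red"), drawn))     # number match only
print(settle_bet((5, "green"), drawn))   # color match only
print(settle_bet((5, "yellow"), drawn))  # no match
```

Checking the rules in a different order would change the result for draws that satisfy more than one rule, so the precedence should be settled before the game is played.
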
Game Rules
A player can place only one bet, on one number and color, per round.
No more than one player can bet on the same number and color in a round.
A player has to confirm his or her bet before the six cards from the deck are revealed.
Comparison
Two – Ten Jack uses 52 cards, while the game we have created uses 40.
Also, Two – Ten Jack is played between the players themselves, while the game we have created uses a dealer.
Besides that, Two – Ten Jack is scored by adding and subtracting points, while we tried to make our game more accessible by having players place bets instead of taking tricks from each other.
Furthermore, we have added to our game a few elements of western card games such as 21.

Statistics Essay: Interpreting Social Data

The British Household Panel Survey of 1991 measured, among other things, many opinions of the UK population. One of the questions asked was whether the husband should be the primary breadwinner in the household while the wife stayed at home. Answers were provided on an ordinal scale, progressing in five categories from Strongly disagree to Strongly agree. Results for each category were recorded from male respondents and female respondents. Of survey respondents, 96.8%, or N = 5325.162, answered this question out of a total survey population of N = 5500.829; 3.2%, or N = 175.667, did not answer the question. In lay terms, this means approximately 97% of the survey respondents answered the question, while 3% did not.
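
The response percentages follow from the three counts quoted above. A short Python check, with variable names chosen only for this sketch:

```python
# Counts quoted in the text.
answered = 5325.162
not_answered = 175.667
total = 5500.829

answered_pct = 100 * answered / total
not_answered_pct = 100 * not_answered / total

print(f"{answered_pct:.1f}% answered, {not_answered_pct:.1f}% did not")
# 96.8% answered, 3.2% did not
```

The two counts add up to the stated total, so the two percentages add up to 100.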

The study presents ordinal ranking, or ranking in a qualitative manner, of five sets of concordant pairs of variables: the male and female count for those who strongly agree the husband be the primary earner while the wife stays at home, the male and female count for those who agree, the male and female count for those who are neutral, the male and female count for those who disagree, and the male and female count for those who strongly disagree. The sex cross-tabulation presents numeric data for responses for each of the ten variables, arranged in five variable pairs with male and female responses for each variable pair. Data is presented in terms of number of responses for each of the ten variables.

The counts or number of responses for each variable are dependent variables in the data analysis. We know they are dependent variables because first, they are presented on the y-axis in the chart graphically representing the data. Dependent variables are graphically represented on the y-axis, with independent variables presented on the x-axis. Causally it becomes more difficult to distinguish between dependent and independent variables at first glance. Dependent variables usually change as a result of independent variables. For example, if one were studying the effect of a certain medication on blood sugar in diabetics, the independent variable would be the amount of medication given to the patient. In a test group or cohort of patients, each would be given a set dosage and their blood sugar responses recorded. One patient may respond with a blood sugar reading of 110 when given 20mg of medicine. Another day the patient, again given 20mg of medicine, may respond with a blood sugar reading of 240. The amount of medicine provided to the patient is fixed, or the independent variable. The response of the patient is variable, and believed to be influenced by, or dependent on, the amount of medicine provided. The dependent variable would therefore be the responding blood sugar reading in each patient.

In this survey, independent variables are the five choices of answers available to the survey takers. These five possible responses are presented to each survey respondent, just as the medicine is provided to the patient in the example above. The respondent then chooses his or her reply from the five possible answers, or chooses not to answer the question at all. The share of those choosing not to answer at all, 3.2%, is considered statistically irrelevant in the analysis of this data. Data related to non-response is not considered from either an independent variable or dependent variable standpoint.

The number of responses, or response count, for a given independent variable in the survey is a dependent variable. The response count will change, at least slightly, from survey to survey. This could be due to a change in survey size, response rate or number of those choosing to respond to the statement, or possible minor fluctuation in percentage response for the five answer possibilities. Although the statistical results of the responses should be similar, given a large enough and representative sample for each survey attempt, some variance is likely to occur. The independent-dependent variable relationship in the ‘Husband should earn, wife should stay at home’ analysis is trickier to get one’s mind around than the medical example given above. In the medical example, it is easy to grasp how a medicine could affect blood sugar, and the resulting cause-effect relationship. In this survey, the creation of five answer groups causes the respondents to categorise their opinion into one of the groups, a much more difficult mental construction than more straightforward cause-result examples.

Four examples of dependent variables in these statistics are the number of men who agreed with the statement (525), the number of women who agreed with the statement (520), the number of men who disagreed with the statement (688), and the number of women who disagreed with the statement (997). As described above, we know these are dependent variables because they are caused by the independent variables, the five ordinal answer groups, in the survey.

Overall, empirical data for the results is skewed towards the Disagree / Strongly disagree end of the survey. Three of the independent variables are of particular note. Strongly agree is the lowest response for both men and women, with Disagree being the highest response for both men and women, although according to Gaussian predictions the Not agree / disagree variable should have the highest distribution.

In lay terms, the graphical representation of each of the five possible answers should have looked like a bell-shaped curve. The two independent variables on each end of the chart, Strongly agree and Strongly disagree, should have had a low but approximately equal response. The middle independent variable on the chart, Not agree / disagree, should have been the largest response. This should have produced dependent variables of approximately 935 each for both men and women for the Not agree / disagree variable. Instead, the response for men was 586, or 63% of typical distribution of answers. The response for women was 702, or 75% of the typically distributed answers. The mean, or average, of all responses in this survey is 1065.2, with the mean or average of male responses being 464.6 and the mean or average of female responses being 600.6. Were the responses distributed evenly amongst all five possible answers, these would be the anticipated response counts.
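
The percentages and means in this paragraph can be reproduced from the quoted figures. In the sketch below the value 935 is taken from the text as given, and the variable names are chosen only for illustration.

```python
# Expected count for the middle category, as quoted in the text.
expected_neutral = 935

# Observed counts for the Not agree / disagree answer.
observed_neutral = {"men": 586, "women": 702}

# Observed count as a share of the expected count.
shares = {group: count / expected_neutral
          for group, count in observed_neutral.items()}

# Mean response count per answer category, as quoted in the text.
male_mean = 464.6
female_mean = 600.6
overall_mean = male_mean + female_mean

for group, share in shares.items():
    print(f"{group}: {share:.0%} of the expected count")
print(f"overall mean per category: {overall_mean:.1f}")
```

The overall mean per category is simply the sum of the male and female means, since each category count is the sum of its male and female counts.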

In examining this data, a hypothesis can be put forth that the correlation between the counts on two of the answer possibilities (two of the dependent variables) will be some value other than zero, at least in the population represented by the survey respondents. This hypothesis can be tested using the ordinal symmetric measures produced in the data analysis. As Pilcher describes, ‘when data on two ordinal variables are grouped and given in categorical order, we want to determine whether or not the relative positions of categories on two scales go together’ (1990, 98). Three ordinal symmetric measures, Kendall’s tau-b, Kendall’s tau-c, and Gamma, were therefore calculated to determine if the order of categories on the amount of agreement to the question would help to predict the order of categories on the count or amount of those selecting each ordinal category. The most appropriate measures of association to evaluate this hypothesis are the two Kendall’s tau measures. The Kendall tau-b measure takes ties into account, whereas tau-c does not correct for ties but adjusts for tables in which the numbers of rows and columns differ, as here. The results of these measures, values of .083 and .102 with an approximate T of 6.75, indicate there is neither a perfect positive nor a perfect negative correlation between variables. Results do indicate a low level of prediction and approximation of sampling distribution. The correlation between two of the dependent variables is indeed a value other than zero, supporting the hypothesis.
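
To show how such measures are built from a cross-tabulation, the following Python sketch counts concordant and discordant pairs and computes Gamma and Kendall's tau-b using their standard definitions. The table is hypothetical, because the full survey cross-tabulation is not reproduced here, and the function names are chosen only for this sketch.

```python
import math


def concordant_discordant(table):
    """Count concordant and discordant pairs in a cross-tabulation."""
    rows, cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):
                for m in range(cols):
                    if m > j:
                        concordant += table[i][j] * table[k][m]
                    elif m < j:
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant


def pairs(count):
    """Number of unordered pairs among `count` observations."""
    return count * (count - 1) // 2


def gamma_and_tau_b(table):
    """Goodman-Kruskal gamma and Kendall's tau-b for a cross-tabulation."""
    c, d = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    row_ties = sum(pairs(sum(row)) for row in table)
    col_ties = sum(pairs(sum(col)) for col in zip(*table))
    gamma = (c - d) / (c + d)
    tau_b = (c - d) / math.sqrt((pairs(n) - row_ties) * (pairs(n) - col_ties))
    return gamma, tau_b


# Hypothetical 2 x 3 table: rows are two groups, columns are ordered answers.
example = [[10, 20, 30],
           [30, 20, 10]]

gamma, tau_b = gamma_and_tau_b(example)
print(concordant_discordant(example), gamma, tau_b)
```

Gamma ignores tied pairs altogether, while tau-b keeps them in the denominator, which is why tau-b is closer to zero than gamma for the same table.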

Three nominal symmetric measures were also calculated. These showed a weak relationship between category and count variables, with a value of only .096 for Phi, Cramer’s V, and Contingency Coefficient. These were not used in testing the above hypothesis.

A theory of distribution, Chebyshev’s theorem, builds on the fact that the standard deviation is larger when data is spread out and smaller when data is compacted. While the data may or may not present according to the empirical rule (bell-shaped), Chebyshev’s theorem contends that defined percentages of the data will always be within a certain number of standard deviations from the mean (Pilcher 1990).
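
The "defined percentages" come from the standard form of Chebyshev's theorem: for any k greater than 1, at least 1 - 1/k² of the data lies within k standard deviations of the mean. A minimal Python sketch, with a function name chosen only for illustration:

```python
def chebyshev_lower_bound(k):
    """Minimum share of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4):
    print(f"within {k} standard deviations: at least {chebyshev_lower_bound(k):.1%}")
```

For k = 2 the bound is 75%, which is lower than the roughly 95% given by the empirical rule for bell-shaped data, because the theorem makes no assumption about the shape of the distribution.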

In this example, data is compressed into five possible answer variables. The data does not present according to the empirical rule, but is skewed towards the disagreement end of the variable scale. However, Chebyshev’s theorem does apply relating to the distribution of data according to standard deviation from the mean for nine of the ten dependent variables. The response count of women who Disagree with the statement that the husband should earn and the wife stay at home was proportionately larger than would be indicated along normal distribution. While the response count for men is also statistically high, it is not beyond the predictions of Chebyshev’s theorem. If the survey had been conducted with fewer independent variables, say three categories instead of five, the resulting data would be more tightly compacted. If the survey had been conducted with ten categories, the data would have been more spread out.

REFERENCES

Pilcher, D., 1990. Data Analysis for the Helping Professions. Sage Publications, London.

Importance and application of data mining

Abstract

Today, businesses can make substantial profits, and these can increase year by year if a consistent approach is applied. Performing data mining can assist the decision-making process within the organization. This paper elaborates on the importance and the application of data mining, which can be adopted in various fields depending on the objectives, mission, goals and purpose of the study conducted within the organization. Three main areas, among them hotels and libraries, are taken as examples to observe how data mining works in these fields.

Keywords: Data Mining, KDD Process, Decision Trees, Ant Colony Clustering Algorithm, Association Rules, Neural Network, Rough Set

1.0 Introduction

As we know, an organization that conducts business transactions keeps a massive number of documents or data in a specific database for later retrieval. The data are combined from several departments that carry out different tasks, each of whose functions is aligned with the mission and vision of the organization. According to Imberman (2001), the number of fields in large databases can approach magnitudes of 10^2 to 10^3. Therefore, it is necessary to make proper decisions or strategic plans using the existing data, which play an important role in ensuring that any action taken does not have a negative impact, especially a loss, on the organization. Other than that, data become obsolete as they keep changing, and are easily outdated as user requirements shift depending on factors such as trends, money, needs and so forth.

One way to analyze data is to use data mining techniques, which assist the organization by following several steps to produce valuable output in a short period of time compared with traditional methods, which may involve more than one methodology and so take longer to complete the investigation of a portion of data. In business, action must be taken quickly in order to compete with competitors and to improve performance both in providing services and in producing high-quality products. Moreover, interpreting the results involves a group of people contributing creativity and synthesis, which can lead to solutions to the problems or tasks.

Clearly, data mining assists a great deal in various fields, with different purposes depending on the objectives to be achieved. The rest of this paper is organized as follows. Section 2 gives definitions of data mining. Section 3 discusses the importance of data mining. Section 4 explains the application of data mining in various fields. Section 5 draws the conclusions.

2.0 Definition of Data Mining

There are broad definitions offered by several researchers and academicians according to their views and opinions, based on the studies they have done. These help to give an idea of data mining before the technique is discussed in more depth.

Basically, the main purpose of data mining is to work with huge amounts of data, either existing or stored in databases, by determining suitable variables that contribute to the quality of the predictions used to solve a problem. As defined by Gargano & Raggad (1999):

“Data mining searches for hidden relationships, patterns, correlations, and interdependencies in large databases that traditional information gathering methods (e.g. report creation, pie and bar graph generation, user querying, decision support systems (DSSs), etc.) might overlook”.

Besides that, another author agrees with this view of data mining as the search for hidden patterns, orientations and trends. Palace (1996) adds to the previous definition:

“Data mining is the process of finding correlations or patterns among dozens of fields in large relational databases”.

Moreover, data mining is also defined as a process of squeezing out knowledge or information, using an appropriate framework or model for analysis, until it produces an output that helps to fulfill the objective of the study. From Imberman (2001):

“As knowledge extraction, information discovery, information harvesting, exploratory data analysis, data archeology, data pattern processing, and functional dependency analysis”.

The following definition agrees with those above and adds that the framework or model adopted serves to expose the real circumstances. As defined by Ma, Chou & Yen (2000):

“Data mining is the process of applying artificial intelligence techniques (such as advanced modeling and rule induction) to a large data set in order to determine patterns in the data”.

On the other hand, data mining takes several steps during analysis, and these steps depend on the methodology chosen. The methodologies do not differ much from one another. According to Forcht & Cochran (1999):

“Data mining is an interactive process that involves assembling the data into a format conducive to analysis. Once the data are configured, they must be cleaned by checking for obvious errors or flaws (such as an item that is an extreme outlier) and simply removing them”.

3.0 Importance of Data Mining

As discussed above, data mining benefits many parties at multiple levels of the organization, as the model or framework applied can reduce time and cost. The results then allow the responsible knowledge worker to turn them into information of strategic value by analyzing them critically.

The process should be done carefully to avoid useful variables or algorithms being removed or excluded from the extraction of reliable data. Data mining techniques help to select a portion of the data, using appropriate tools to filter outliers and anomalies within the data set. According to Gargano & Raggad (1999), the other important benefits of data mining include:

· To facilitate the explication of previously hidden information, including the capabilities to discover rules, classify, partition, associate and optimize.

According to Goebel & Gruenwald (1999), in order to find patterns in data, several methodologies are used to clarify vagueness and to identify the relations among variables within the databases; the outcome guides decision making or forecasts the impact of the actions under consideration. The choice of methodology should suit the rules and conditions of the data to be analyzed. The methodologies include:

Statistical Methods: focused mainly on testing of preconceived hypotheses and on fitting models to data.
Case-Based Reasoning (CBR): technology that tries to solve a given problem by making direct use of past experiences and solutions.
Neural Networks: formed from large numbers of simulated neurons, connected to each other in a manner similar to brain neurons which enables the network to “learn”.
Decision Trees: each non-terminal node represents a test or decision on the considered data item and can also be interpreted as a special form of a rule set, characterized by their hierarchical organization of rules.
Rule Induction: Rules state a statistical correlation between the occurrences of certain attributes in a data item, or between certain data items in a data set.
Bayesian Belief Networks: graphical representations of probability distributions derived from co-occurrence counts in the set of data items.
Genetic algorithms / Evolutionary Programming: formulate hypotheses about dependencies between variables, in the form of association rules or some other internal formalism.
Fuzzy Sets: constitute a powerful approach to deal not only with incomplete, noisy or imprecise data, but may also be helpful in developing uncertain models of the data that provide smarter and smoother performance than traditional systems.
Rough Sets: rough sets are a mathematical concept dealing with uncertainty in data and used as a stand-alone solution or combined with other methods such as rule induction, classification, or clustering methods.

· The ability to seamlessly automate and embed some of the mundane, repetitive, tedious decision steps not requiring continuous human intervention.

Several steps are taken to process or analyze the selected data; the process involves filtering, transforming, testing, modeling, visualizing and documenting the results, or storing them in databases or a data warehouse. Each step functions differently and is responsible for carrying out part of the process, with the aim of easing the work and producing high-quality assumptions generated automatically under specific conditions. For example, a data warehouse also keeps previous analyses, which allows redundant output to be eliminated at certain steps. Ma, Chou & Yen (2000) stress the characteristics of data mining that define how it helps to reach the end of the analysis process. These comprise:

Data pattern determination: Data-access languages or data-manipulation languages (DMLs) identify the specific data that users want to pull into the program for processing or display. It also enables users to input query specifications. Therefore, users simply select the desired information from the menus, and the system builds the SQL command automatically.
Formatting capability: It generates raw data formats, tabular, spreadsheet form, multidimensional-display and visualization.
Content analysis capability: Data mining also has a strong content analysis capability that enables the user to process the specifications written by the end-users.
Synthesis capability: Data mining allows data synthesis to be timely executed.

· Simultaneously reducing the cost and the potential errors encountered in the decision-making process.

Basically, data mining can minimize forecasting error when the steps of the selected methodology are followed properly, avoiding delays in decision making, which would have a big impact on the business. The data must therefore be handled carefully throughout the steps involved, and the strategic plan should take into account the objectives of the analysis, the amount of data, the variables, the relationships between variables, the tests adopted, and so forth. Moreover, any need to consult professionals about the study should be included in the planning. In an organization, a unit or group of people is usually given the responsibility of discovering hidden patterns for another department. Hence, regular meetings should be held between the professionals and the researchers to ensure that the end result fulfills their requirements and improves the performance of workers, departments and the organization.

In terms of reducing cost, compare traditional research, which takes time to acquire data from respondents, depending on the methodology used and the sample size. A questionnaire can be administered quickly, but interviews take time, and the researcher may have to meet a respondent more than once if an answer is ambiguous or does not meet the requirements. For certain studies, the sample is drawn from different locations, which requires the researcher to travel in order to obtain respondents' genuine opinions, and this costs a great deal in accommodation, food, flight tickets and so forth. Data mining uses existing data (for example, customer transactions, student registrations, records of patients undergoing operations and so on) kept in a data warehouse, which greatly reduces the cost of acquiring data. In addition, once the objective has been determined at the beginning of a study, the researcher first searches the data warehouse for previous studies, because these are stored there. If a matching study is found, some steps can be skipped or decided easily, which shows that data mining can reduce cost as well as time. According to Gargano & Raggad (1999), data mining also yields long-term benefits that outweigh the costs incurred in the development, implementation, and maintenance of such systems by a wide margin.

4.0 The application of Data Mining

Nowadays, data mining is widely used, especially by organizations with a consumer orientation, such as retail, financial, communication, and marketing organizations (Palace, 1996). Healthcare also benefits from applying data mining in daily operations. Organizations in these fields carry out different transactions, the details of which are kept in databases, and this makes it possible to perform analyses for multiple purposes, such as increasing revenue, gaining more customers and improving customer satisfaction. Moreover, according to Palace (1996), existing data make it possible to determine relationships among internal factors, such as price, product positioning or staff skills, and external factors, such as economic indicators, competition and customer demographics.

The following three examples show applications of data mining in different areas, namely the hotel sector, libraries and hospitals. In each, the goal is to reduce or eliminate weaknesses by interpreting the results properly to support decisions on the best solutions. The examples are as follows:

· A data mining approach to developing the profiles of hotel customers.

Min, Min & Ahmed Emam (2002) conducted a study with the objective of targeting some of the valued customers for special treatment based on their anticipated future profitability to the hotel. The study addresses several questions about customer profiling:

Which customers are likely to return to the same hotel as repeat guests?
Which customers are at greatest risk of defecting to other competing hotels?
Which service attributes are more important to which customers?
How to segment the customer population into profitable or unprofitable customers?
Which segment of customers best fits the current service capacities of the hotels?

From the broad range of data mining methods, the researchers adopted decision trees to analyze the data because of their ability to generate appropriate rules, their visualization and their simplicity. The process follows three steps:

Data collection: selecting data that suit the objective from a previous survey, and removing unwanted data by filtering the Excel file.
Data formatting: converting all data in the spreadsheet to the Statistical Package for the Social Sciences (SPSS) for the purpose of classification accuracy.
Rule induction: selecting an algorithm for building decision trees, in this case C5.0, to generate sets of rules that give the hotel manager important clues for further action.

As a result, the researchers found the “if-then” rules useful in formulating a customer retention strategy, with predictive accuracy ranging from 80.9 per cent to 93.7 per cent, where the predictive accuracy of a rule is the percentage of times that its conditions lead to the predicted outcome.
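To make the idea of a rule's predictive accuracy concrete, the following minimal Python sketch scores one hypothetical “if-then” rule against a handful of invented guest records. The field names (`satisfaction`, `previous_stays`, `returned`), the rule and the data are assumptions made purely for illustration; they are not taken from Min, Min & Ahmed Emam (2002), and the study itself used C5.0 rather than hand-written rules.

```python
# Hypothetical guest records; field names and values are invented for illustration.
guests = [
    {"satisfaction": "high", "previous_stays": 3, "returned": True},
    {"satisfaction": "high", "previous_stays": 2, "returned": True},
    {"satisfaction": "high", "previous_stays": 5, "returned": False},
    {"satisfaction": "high", "previous_stays": 2, "returned": True},
    {"satisfaction": "low", "previous_stays": 4, "returned": False},
    {"satisfaction": "high", "previous_stays": 0, "returned": False},
]


def rule_accuracy(records, condition, outcome):
    """Score the rule IF condition THEN outcome.

    Returns (covered, accuracy): the number of records matching the
    condition, and the share of those for which the outcome also holds.
    """
    covered = [r for r in records if condition(r)]
    if not covered:
        return 0, None  # the rule never fires, so accuracy is undefined
    correct = sum(1 for r in covered if outcome(r))
    return len(covered), correct / len(covered)


# Hypothetical rule: IF satisfaction is high AND at least 2 previous stays
# THEN the guest returns.
covered, accuracy = rule_accuracy(
    guests,
    condition=lambda r: r["satisfaction"] == "high" and r["previous_stays"] >= 2,
    outcome=lambda r: r["returned"],
)
print(f"rule covers {covered} guests, accuracy {accuracy:.1%}")
```

On these invented records the rule covers four guests and is right for three of them, so its accuracy is 75 per cent.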

· Using data mining technology to provide a recommendation service in the digital library.

Chen & Chen (2006) conducted a study with the purpose of providing a recommendation system architecture to promote digital library services in electronic libraries. Digital publications come in a broad range of formats, such as audio, video and pictures, which makes it difficult to analyze or define keywords and content in order to gain information from users and improve the services of digital libraries.

Two data mining models were selected in the methodology:

o Ant Colony Clustering Algorithm;

This model can find the shortest path, or reduce the time needed to find the output that best fits a problem existing in the organization. Each step has a different function and helps reveal the relationships among the variables. The steps are:

Step 0: Set parameters and initialize pheromone trails.

Step 1: Each ant constructs its solution

Step 2: Calculate the scores of all solutions

Step 3: Update the pheromone trails.

Step 4: If the best solution has not been changed after some predefined iterations, terminate the algorithm; otherwise go to step 2.
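Steps 0 to 4 can be sketched in Python as follows. This is a deliberately simplified, pheromone-based assignment of one-dimensional points to clusters, not the algorithm of Chen & Chen (2006). The cost function, the parameter names (`n_ants`, `evaporation`, `patience`) and the sample points are all assumptions. The sketch also assumes that each new iteration starts with the ants constructing fresh solutions.

```python
import random


def within_cluster_cost(points, labels, k):
    """Sum of squared distances of each point to its cluster mean (lower is better)."""
    cost = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        mean = sum(members) / len(members)
        cost += sum((p - mean) ** 2 for p in members)
    return cost


def ant_colony_cluster(points, k=2, n_ants=10, evaporation=0.1, patience=20, seed=0):
    rng = random.Random(seed)
    # Step 0: set parameters and initialize pheromone trails uniformly.
    pheromone = [[1.0] * k for _ in points]
    best_labels, best_cost, stale = None, float("inf"), 0
    while stale < patience:  # Step 4: stop after `patience` iterations without improvement
        # Step 1: each ant builds a solution, picking a cluster for every point
        # with probability proportional to the pheromone on that choice.
        solutions = [
            [rng.choices(range(k), weights=pheromone[i])[0] for i in range(len(points))]
            for _ in range(n_ants)
        ]
        # Step 2: calculate the scores of all solutions.
        scored = [(within_cluster_cost(points, s, k), s) for s in solutions]
        cost, labels = min(scored, key=lambda pair: pair[0])
        if cost < best_cost:
            best_cost, best_labels, stale = cost, labels, 0
        else:
            stale += 1
        # Step 3: update the trails (evaporate, then reinforce the best solution so far).
        for i, lab in enumerate(best_labels):
            for c in range(k):
                pheromone[i][c] *= 1.0 - evaporation
            pheromone[i][lab] += 1.0
    return best_labels, best_cost


# Invented sample data: two loose groups of values.
points = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]
labels, cost = ant_colony_cluster(points)
print(labels, round(cost, 3))
```

Reinforcing only the best solution found so far is one common design choice; other variants let every ant deposit pheromone in proportion to the quality of its solution.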

o Association rules to discover the hidden pattern.

This model finds co-purchased items and helps uncover relationships in the form of association rules. There are two main steps:

Step 1: Find all large item sets.

Step 2: Use the large item sets generated in the first step to generate all the effective association rules.
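The two steps can be illustrated with a small brute-force Python sketch. It assumes that "large" means an item set meets a minimum support threshold and that "effective" means a rule meets a minimum confidence threshold. The loan records, the thresholds and the function names are hypothetical. A production system would use a more efficient algorithm such as Apriori rather than enumerating every combination.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: return {itemset: support} for every item set meeting min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t) / n
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            break  # no larger item set can be frequent if none of this size is
    return result


def association_rules(itemsets, min_confidence):
    """Step 2: derive rules (antecedent, consequent, confidence) from the item sets."""
    rules = []
    for itemset, support in itemsets.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                a = frozenset(antecedent)
                confidence = support / itemsets[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules


# Hypothetical loan records: each set holds the subjects borrowed together.
loans = [
    {"stats", "python"},
    {"stats", "python", "sql"},
    {"stats", "sql"},
    {"python"},
]
large = frequent_itemsets(loans, min_support=0.5)
rules = association_rules(large, min_confidence=0.8)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), confidence)
```

With these invented records, five item sets reach 50 per cent support, and the only rule with at least 80 per cent confidence is that users who borrow "sql" also borrow "stats".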

As a result, the two models yield more than one solution and produce many recommendations that can be applied to various problems in running digital libraries, as well as promoting usage among users at multiple levels through appropriate mechanisms and suitable services.

· Using KDD process to forecast the duration of surgery.

Combas, Meskens & Vandamme (2007) conducted a study that aimed to identify classes of surgery likely to take different lengths of time according to the patient’s profile, so that the use of the operating theatre can be better scheduled. Many issues in this field led to the study. For example, an endoscopy unit uses endoscopy tubes (shared resources) during surgery, but their availability is limited because it takes 30-45 min to clean and sterilize each one. The scheduling of endoscopies (and all other operating theatre procedures) must obviously take into account the availability of these different resources.

The researchers adopted the Knowledge Discovery in Databases (KDD) process to analyze the massive data in the databases. The steps are as follows:

Step 1: Data preparation, in which the selected data must fulfil the requirements, including secondary diagnoses, “previous active history” and the system affected.
Step 2: Data cleaning, in which the data are filtered to keep surgical procedures that had been performed at least 40 times (at least 20 times for combinations involving both surgery and specific surgeons).
Step 3: Data mining, in which appropriate methods are chosen to test on a portion of the data; here these are rough sets and neural networks.
Step 4: Validation by comparison, in which the results of the two data analysis methods are compared in order to observe the rate of good classification.
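The frequency filter described in Step 2 can be sketched in a few lines of Python. The record layout (a procedure name paired with a surgeon identifier), the sample counts and the helper name `filter_by_frequency` are assumptions made for illustration; only the thresholds of 40 and 20 come from the description above.

```python
from collections import Counter


def filter_by_frequency(records, key, min_count):
    """Keep only the records whose key occurs at least min_count times."""
    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] >= min_count]


# Hypothetical records of the form (procedure, surgeon).
records = (
    [("endoscopy", "A")] * 30
    + [("endoscopy", "B")] * 15
    + [("biopsy", "A")] * 10
)

# Procedures performed at least 40 times overall.
by_procedure = filter_by_frequency(records, key=lambda r: r[0], min_count=40)

# Procedure and surgeon combinations observed at least 20 times.
by_pair = filter_by_frequency(records, key=lambda r: r, min_count=20)

print(len(by_procedure), len(by_pair))
```

In this invented data set only endoscopy passes the first threshold (45 records), and only the combination of endoscopy with surgeon A passes the second (30 records).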

The researchers then added three further steps to fit the proposed objective and to produce the best outcome in forecasting the duration of surgery:

o Step 5: Measuring the impact of predicting the duration of surgery on planning. In this step, the duration of surgery supplied by the prediction models (empirical laws, rule-based laws, etc.), based on information stored in the database, is used to feed a series of algorithms and heuristics for planning purposes.
o Step 6: Simulation. This step makes it possible to simulate the activity of the different theatre suites in terms of the operating sequence determined by the planning methods, under two scenarios: operating data and patient’s profile.
o Step 7: Validation and selection of the best model. The results supplied by the simulation model should make it possible to assess the quality of scheduling on the basis of a series of performance indicators, such as the length of time for which the operating theatres are not in use, the number of potential additional hours, and errors in predicting the duration of surgery.

According to the researchers, the results were not particularly satisfactory. The main problem seems to be the choice of variable grouping, which might have an effect on prediction quality.

5.0 Conclusion

In conclusion, data mining can be considered an effective and efficient way to discover hidden information, or to make the invisible visible, in data retrieved from databases that can store huge amounts of data. With the right tools, it helps to analyze, synthesize and manipulate the content of data for various purposes, and the target is often defined by the main business of the organization.

From the discussion above, it can be seen that performing data mining has many advantages, especially in business, where it allows an organization to predict trends, customer requirements, relationships and so forth. Early preparation can then be made to seek one or more alternative ways to ensure that the organization can continue its daily operations if it does not agree with the results obtained.

To produce an end result that satisfies the organization and minimizes error, the information must be applied successfully in carrying out business transactions. The key variables should be assigned properly to suit the objective proposed for the study, because the procedures have to be repeated when errors are found, and the decision-making process then cannot be completed according to the timeline.

6.0 References

Chen, Chia-Chen & Chen, An-Pin. (2006). Using data mining technology to provide a recommendation service in the digital library. The Electronic Library. 25(6), 711-734.

Combas, C., Meskens, N. & Vandamme, J. P. (2007). Using a KDD process to forecast the duration of surgery. International Journal of Production Economics. 112, 279-293.

Forcht, Karen A. & Cochran, Kevin. (1999). Using data mining and datawarehousing techniques. Industrial Management & Data Systems. 99(5), 189-196.

Gargano, Michael L. & Raggad, Bel G. (1999). Data mining – a powerful information creating tool. OCLC Systems & Services. 15(2), 81-90.

Goebel, Michael & Gruenwald, Le. (1999). A survey of data mining and knowledge discovery software tools. ACM SIGKDD Explorations Newsletter. 1, 20-33.

Imberman, Susan P. (2001). Effective Use of the KDD Process and Data Mining for Computer Performance Professionals. In International Computer Measurement Group Conference. Anaheim: USA, 611-620.

Ma, Catherine, Chou, David C. & Yen, David C. (2000). Data warehousing, technology assessment and management. Industrial Management & Data Systems. 100(3), 125-135.

Min, Hokey, Min, Hyesung & Ahmed Emam. (2002). A data mining approach to developing the profiles of hotel customers. International Journal of Contemporary Hospitality Management. 14(6), 274-285.

Palace, Bill. (1996, Spring). Data Mining: What is Data Mining? Retrieved March 2, 2010, from: http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm

Impact of Social Determinants on Health

Song et al (2011) studied the influence of social determinants of health on disease rates. They specified AIDS as the disease of concern and utilized data from the American Community Survey. They used correlation and partial correlation coefficients to quantify the effect of socioeconomic determinants on AIDS diagnosis rates in certain areas and found that the AIDS diagnosis rate was correlated with kind, marital status and population density. Poverty, education level and unemployment also determine the cause of disease in an individual.
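For readers unfamiliar with the two coefficients, the Python sketch below computes a Pearson correlation and a first-order partial correlation, which measures the association between two variables after controlling for a third. The variable names and the area-level figures are invented for illustration and are not data from Song et al (2011). The partial correlation uses the standard formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after controlling for z (first order)."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Invented area-level figures, one value per area.
poverty_rate = [10, 20, 30, 40, 50]
population_density = [1, 3, 2, 5, 4]
diagnosis_rate = [12, 25, 28, 47, 49]

r = pearson(poverty_rate, diagnosis_rate)
r_partial = partial_correlation(poverty_rate, diagnosis_rate, population_density)
print(round(r, 3), round(r_partial, 3))
```

Comparing the two values shows how much of the raw association remains once the third variable is held constant.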

In developed and developing countries socioeconomic status proved to be an important cause of cardiovascular disease. Survey studies showed that education was the most important socioeconomic determinant in relation to cardiovascular risk factor. Smoking was also a major cause of cardiovascular disease. Low socioeconomic status had a direct relationship with higher levels of cardiovascular risk factors (Yu et al, 2000; Reddy et al, 2002; Jeemon & Reddy, 2010; Thurston et al, 2005; Janati et al, 2011 and Lang et al, 2012).

Lantz et al (1998) investigated the impact of education, income and health behaviors on the risk of dying within the next 7.5 years using a longitudinal survey study. The results of cross tabulation showed that the mortality rate has a strong association with education and income.

Habib et al (2012) conducted a questionnaire-based survey to measure the social, economic, demographic and geographic influence on the disease of bronchial asthma in Kashmir valley. After analysis in SPSS they concluded that non-smokers, males working on farms and females working with animals have a high incidence of bronchial asthma. The study also showed a significant relationship between age and the disease.

Arif and Naheed (2012) used “The Pakistan Social and Living Standard Measurement Survey 2004-05” conducted by the Federal Bureau of Statistics to determine the socioeconomic, demographic, environmental and geographical factors of diarrhea morbidity among the sampled children. Their study found a relationship between diarrhea morbidity and economic factors particularly ownership of land, livestock and housing conditions. Child’s gender and age, total number of children born, mother’s age and education and sources of drinking water did show significant effect on the diarrhea morbidity among children.

Aranha et al (2011) conducted a survey in Brazil’s district Sao Paulo to determine the association between children’s respiratory diseases reported by parents, attendance at school, parents’ educational level, family income and socioeconomic status. By applying the chi-square test they concluded that the health of children is associated with parents’ higher education, particularly that of mothers. Family income, analyzed according to per capita income, did not affect the number of reports of respiratory diseases from parents.
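As an illustration of how a chi-square test of independence works, the Python sketch below computes the test statistic for a small contingency table. The counts are invented and do not come from Aranha et al (2011). The sketch omits the continuity correction that is sometimes applied to 2x2 tables.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows of observed counts. Expected counts are
    row_total * column_total / grand_total, as in the test of independence.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Invented counts. Rows: mother's education (higher, lower).
# Columns: respiratory disease reported (no, yes).
observed = [
    [30, 10],
    [20, 40],
]
statistic, dof = chi_square_statistic(observed)
print(round(statistic, 3), dof)
```

For one degree of freedom the critical value at the 5 per cent level is 3.841, so a statistic of about 16.67 on these invented counts would indicate an association between the two variables.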

Deolalikar and Laxminarayan (2000) used data from the 1997 Cambodia Socioeconomic Survey to estimate the influence of socioeconomic variables on the extent of disease transmission within villages in Cambodia. They concluded that infectious diseases were the leading cause of morbidity in the country. Younger adults were less likely to get infected by others, but the likelihood increased with age. Income and the availability of a doctor had a significant effect on disease transmission.

Survey studies based on different countries showed a strong association between socioeconomic factors (income, education and occupational position) and obesity. After analysis there was a significant effect of consumption of low quality food due to economic factors on increased obesity. For men, both the highest level of occupational position and general education completed were found to have a significant effect on obesity while women in the lowest income group were three times as likely to be obese as women in the highest income group (Kuntz and Lampert, 2010; Akil and Ahmad, 2011 and Larsen et al, 2003).

Yin et al (2011) used data from the 2007 China Chronic Disease Risk Factor Surveillance of 49,363 Chinese men and women aged 15-69 years to examine the association between the prevalence of self-reported physician-diagnosed Chronic Obstructive Pulmonary Disease (COPD) and socioeconomic status defined by both educational level and annual household income. Multivariable logistic regression modeling was performed. Among nonsmokers, low educational level and household income were associated with a significantly higher prevalence of COPD.

Siponen et al (2011) studied the relationship between the health of Finnish children under 12 years of age and parental socioeconomic factors (educational level, household income and working status) by conducting a population-based survey. The analysis was done using Pearson’s chi-square tests and logistic regression analysis with 95% confidence intervals. The results showed that parental socioeconomic factors were not associated with the health of children aged under 12 years in Finland.

Washington State Department of Health (2007) examined Washington adults and inferred that adults with lower incomes or less education were more likely to smoke, be obese, or eat fewer fruits and vegetables than adults in the broader culture with higher incomes and more education. In cultures where smoking was culturally unacceptable for women, women died less often from smoking-related diseases than women in groups where smoking was socially accepted. Lack of access to, or inadequate use of, medical services contributed to relatively poorer health among people in lower socioeconomic position groups. Health care received by the poor was inferior in quality. People of higher socioeconomic position had larger networks of social support. Low levels of social capital had been associated with higher mortality rates. People who experienced racism were more likely to have poor mental health and unhealthy lifestyles.

Hosseinpoor et al (2012) took self-reported data, stratified by sex and low or middle income, from 232,056 adult participants in 48 countries, derived from the 2002–2004 World Health Survey. A Poisson regression model with a robust variance and cross tabulations were used, yielding the following results. Men reported higher prevalence than women for current daily smoking and heavy episodic alcohol drinking, and women had higher levels of physical inactivity. In both sexes, low fruit and vegetable consumption were significantly higher.

Braveman (2011) concluded that there was a strong relationship between income, education and health. Health improved as income or education increased. Stressful events and circumstances followed a socioeconomic gradient, decreasing as income increased.

Lee (1997) examined the effects of age, nativity, population size of place of residence, occupation, and household wealth on the disease and mortality experiences of Union army recruits while in service using Logistic regression. The patterns of mortality among recruits were different from the pattern of mortality among civilian populations. Wealth had a significant effect only for diseases on which nutritional influence was definite. Migration spread communicable diseases and exposed newcomers to different disease environments, which increased morbidity and mortality rate.

Ghias et al (2012) studied HCV-positive patients living in the province of Punjab, Pakistan. Socio-demographic factors and risk factors were sought using a questionnaire. Logistic regression and artificial neural network methods were applied, and the authors found that the patient’s education, the patient’s liver disease history, family history of hepatitis C, migration, family size, history of blood transfusion, injection history, endoscopy, general surgery, dental surgery, tattooing and minor surgery by a barber were the 12 main risk factors that had a significant influence on HCV infection.

REFERENCES

Song, R. et al (2011) “Identifying The Impact Of Social Determinants Of Health On Disease Rates Using Correlation Analysis Of Area-Based Summary Information” Public Health Reports Supplement 3, Volume 126, 70-80.
Yu, Z. et al (2000) “Associations Between Socioeconomic Status And Cardiovascular Risk Factors In An Urban Population In China” Bulletin of the World Health Organization Volume 78, No. 11, 1296-1305.
Reddy, K. et al (2002) “Socioeconomic Status And The Prevalence Of Coronary Heart Disease Risk Factors” Asia Pacific J Clin Nutr Volume 11, No. 2, 98-103.
Jeemon, P. & Reddy, K. (2010) “Social Determinants Of Cardiovascular Disease Outcomes In Indians” Indian J Med Res Volume 132, 617-622.
Thurston, R. et al (2005) “Is The Association Between Socioeconomic Position And Coronary Heart Disease Stronger In Women Than In Men?” American Journal of Epidemiology Volume 162, No. 1, 57-65.
Janati, A. et al (2011) “Socioeconomic Status and Coronary Heart Disease” Health Promotion Perspectives Volume 1, No. 2, 105-110.
Lang, T. et al (2012) “Social Determinants Of Cardiovascular Diseases” Public Health Reviews Volume 33, No. 2, 601-622.
Lantz, P. et al (1998) “Socioeconomic Factors, Health Behaviors, and Mortality” JAMA Volume 279, No. 21, 1703-1708.
Habib, A. et al (2012) “Socioeconomic, Demographic and Geographic Influence on Disease Activity of Bronchial Asthma in Kashmir Valley” IOSR Journal of Dental and Medical Sciences (JDMS) ISSN: 2279-0853, ISBN: 2279-0861, Volume 2, No. 6, 04-07.
Arif, A. and Naheed, R. (2012) “Socio-Economic Determinants Of Diarrhoea Morbidity In Pakistan” Academic Research International ISSN-L: 2223-9553, ISSN: 2223-9944, Volume 2, No. 1, 490-518.
Aranha, M. et al (2011) “Relationship Between Respiratory Tract Diseases Declared By Parents And Socioeconomic And Cultural Factors” Rev Paul Pediatr Volume 29, No. 3, 352-356.
Deolalikar, A. and Laxminarayan, R. (2000) “Socioeconomic Determinants of Disease Transmission in Cambodia” Resources for the Future Discussion Paper, 00–32.
Kuntz, B. and Lampert, T. (2010) “Socioeconomic Factors and Obesity” Deutsches Arzteblatt International Volume 107, No. 30, 517-22.
Akil, L. and Ahmad, H. (2011) “Effects Of Socioeconomic Factors On Obesity Rates In Four Southern States And Colorado” Ethnicity & Disease Volume 21, 58-62.
Larsen, P. et al (2003) “The Relationship of Ethnicity, Socioeconomic Factors, and Overweight in U.S. Adolescents” Obesity Research Volume 11, No. 1, 121-129.
Yin, P. et al (2011) “Prevalence Of COPD And Its Association With Socioeconomic Status In China: Findings From China Chronic Disease Risk Factor Surveillance 2007” BMC Public Health Volume 11, 586-593.
Siponen, M. et al (2011) “Children’s Health And Parental Socioeconomic Factors: A Population-Based Survey In Finland” BMC Public Health Volume 11, 457-464.
Washington State Department of Health (2007) “Social and Economic Determinants of Health” The Health of Washington State Volume 1, No. 3, 01-07.
Hosseinpoor, A. et al (2012) “Socioeconomic inequalities in risk factors for noncommunicable diseases in low-income and middle income countries: results from the World Health Survey” BMC Public Health Volume 12, 912-924.
Braveman, P. (2011) “Accumulating Knowledge on the Social Determinants of Health and Infectious Disease” Public Health Reports Supplement 3, Volume 126, 28-30.
Lee, C. (1997) “Socioeconomic Background, Disease, and Mortality among Union Army Recruits: Implications for Economic and Demographic History” Explorations in Economic History Volume 34, 27-55.
Ghias, M. et al (2012) “Statistical Modelling and Analysis of Risk Factors for Hepatitis C Infection in Punjab, Pakistan” World Applied Sciences Journal Volume 20, No. 2, 241-252.

Impact of Smartphones on Students

Problem Statement

With today’s advanced technology, the smartphone is viewed as an important device and an integral part of Malaysian society. According to The Sun Daily, an analysis concluded last year revealed that Malaysia’s smartphone penetration increased to 63% in 2013 from 47% in 2012, while tablet penetration almost tripled, to 39% from 14% (Afrizal, 2013). University students are among the highest contributors to the increasing number of smartphone sales (Jacob & Isaac, 2008). However, frequent use of a smartphone can become a habit or dependency for students and indirectly affect their lifestyle. Several general aspects of lifestyle have been identified, such as health, education, psychology, socialization and security, and the effect on each may be positive or negative.

Regarding the impact of smartphones on business, Rashedul Islam, Rofiqul Islam & Tahidul Arafhin Mazumder (2010) state that the drastic growth of businesses during the past few years is mainly due to the rising use of smartphones and mobile applications. The smartphone has made advertising in the business sector more interesting and effective. However, the smartphone has had a negative impact on the PC market: survey results for 2011 show that smartphone shipments for the full year were 487.7 million, exceeding those of PCs by 17.63%. Smartphones today are much more formidable than the PCs of more than 10 years ago, and people now also use smartphones to check news feeds, update their status and post photos (Mogg, 2012). The Microsoft-Intel alliance, which long dominated PCs, has also faced pressure to enter the mobile device market. PCs may soon be replaced by smartphones, as smartphones seem to have optimistic growth prospects, although millions of PCs are still sold every year (eWeek, 2012).

According to health surveys regarding smartphones by Sarwar & Soomro (2013), most users in the USA use smartphones to search for health-related information and facilities. Many health applications are available that help users with prescription management, encourage other treatment options, and offer price comparison and verification of prescriptions. However, Russian and Eastern European scientists issued the earliest reports that low-level exposure to RF radiation from smartphones could cause a wide range of health effects, including behavioural changes, effects on the immunological system, reproductive effects, changes in hormone levels, headaches, irritability, fatigue, and cardiovascular effects (Russian National Committee, n.d.). In addition, research by the World Health Organization suggested that this behaviour is similar to a compulsive-impulsive disorder, whereby an inability to access the services is associated with negative health consequences, including withdrawal and depression, and other negative repercussions such as social isolation and fatigue (WHO, 2011). According to Coleman (2013), smartphones can also contribute to the deterioration of our eyes, squash our spines, give us saggy jowls, damage our hearing, damage our sleep cycle and cause dark circles under our eyes.

Meanwhile, in terms of education, Sarwar & Soomro (2013) indicated that the smartphone has exposed society to a huge amount of educational and learning material, owing to internet availability and the increasing demand for smartphones. According to the survey by King (2012), the majority of American adults think that smartphone usage has a positive impact on the education of young people in America, e.g. e-readers for study purposes. With the help of technology, students are able to access educational programs (Font, 2013). For instance, Dell has launched Youth Learning (an alphabetization initiative), which supports learning programs. Besides that, the smartphone meets a basic human need by helping students relieve boredom and decompress between tasks (Shawn Knight, 2012). However, smartphone dependency also has some negative impacts on education. Over-dependency on smartphones can lead to addiction, meaning that users still hope for constant communication with the outside world through social networks even when there is no real need to communicate (Lee, 2012). According to The Times of India: Health (2013), experts said that using a smartphone reduces our memory and kills cognitive thinking, although it makes life more convenient and easier. People now depend so much on search engines through their smartphones that they become poor thinkers and lazier than before.

As for the impact on psychology, another study by Sarwar & Soomro (2013) found a positive psychological impact: the smartphone is used to reduce the tension of work life. Nowadays, keeping up to date with the latest news is vital for reducing tension. However, smartphone dependency also has a negative impact. Spending more than seven hours a day using a smartphone and experiencing symptoms such as anxiety, insomnia and depression when cut off from the device is considered addiction (Nam, 2013). Students who are addicted to smartphones not only distract themselves from their studies but also damage their interpersonal skills. According to Sarwar and Soomro (2013), addiction to smartphones affects our quality of sleep and creates friction in our social and family life.

As for socialization, the survey by Yi-Fan Chen conducted among U.S. college students shows that students have several strong socialization motives for using the mobile phone to contact both family and friends (Chen, 2007). With smartphone features such as text-to-speech, GPS and social websites, people, especially those with special needs and the elderly, can easily remain integrated with society (Sarwar & Soomro, 2013). However, the report by Amanda (2012) shows that, as a result of over-dependence on smartphones, only 35% of teens who own a smartphone socialize face-to-face outside of school. According to Teoh (2011), Americans spend an average of 2.7 hours per day socializing on their mobile devices. The time people spend socializing via mobile devices is twice the time they spend eating and more than one third of the time they spend sleeping per day.

As for the impact of smartphones on security, Sarwar & Soomro (2013) stated that parents can check on the safety of their children through a smartphone with an Internet connection. Furthermore, setting a password protects the sensitive data inside the smartphone and restricts access if the smartphone is lost or stolen (BullGuard Security Centre, 2013). According to ENISA’s report (2010), data leakage from a smartphone may affect assets such as personal data, corporate intellectual property, classified information and financial assets. For example, if a user loses a smartphone, information such as addresses, e-mail, web browser log data and SMS (Short Message Service) messages can be exposed if there are no appropriate security solutions (Smith, 2011). Smartphones and social networking sites are likely to be the next targets for criminal attacks (Sarwar & Soomro, 2013). According to WhoCalledMyPhone.Net (as cited in Darrell, 2013), 24% of smartphone users check their phone while driving, which can directly cause accidents, including fatal ones.

In short, the smartphone has had positive impacts on people, but too much dependence on it also has negative consequences. Hence, our study will focus on the impacts of smartphone dependency on lifestyle. The smartphone affects various fields such as business, health, education, psychology, socialization and security. However, the target population of our research is undergraduate students at UUM, and some fields, for instance business, are not suitable for students. Therefore, only five aspects of lifestyle will be used in our survey: health, education, psychology, socialization and security.

References

Afrizal. (2013, September 5). Malaysia’s smartphone penetration rises by 16%. The SunDaily. Retrieved March 2, 2014 from http://www.thesundaily.my/news/820932

Amanda, L. (2012). Teens, Smartphones & Texting. Pew Research Center’s Internet & American Life Project, pp. 1-34.

BullGuard Security Centre. (2013). Eight ways to keep your smartphone safe: Mobile Security. Retrieved March 23, 2014, from http://www.bullguard.com/bullguard-security-center/mobile-security/mobile-protection-resources/8-ways-to-keep-your-smartphone-safe.aspx

CHEN, Y.-F. (2007). The mobile phone and socialization: The consequences of mobile phone use in transitions from family to school life of U.S. college students. Journal of Cyber Culture and Information Society, pp. 1-152.

Coleman, C. (2013, July 21). How your mobile can give you acne…not to mention a saggy jaw and sleepless nights. Daily Mail. Retrieved March 18, 2014, from http://www.dailymail.co.uk/femail/article-2372752/How-MOBILE-acne–mention-saggy-jaw-sleepless nights.html?ITO=1490&ns_mchannel=rss&ns_campaign=1490

Darrell, R. (2013). The impressive effects of smartphones on society (infographic). Bit Rebels. Retrieved March 18, 2014, from http://www.bitrebels.com/technology/the-effects-of-smartphones-on-society/

eWeek, September 5, 2012, ”Intel Microsoft Influence Declining as Smartphones Tablets Rise Analysts 342948”, http://business.highbeam.com/137475/article-1G1-301713950/intelmicrosoft-influence-declining-smartphones-tablets

ENISA (n.d.). Top Ten Smartphone Risk. Retrieved 17 March 2014, from http://www.enisa.europa.eu/activities/Resilience-and-CIIP/critical-applications/smartphone-security-1/top-ten-risks

Gehi, R. (2013, December 3). Your smartphone is destroying your memory. The Times of India. Retrieved 23 March, from http://timesofindia.indiatimes.com/life-style/health-fitness/health/Your-smartphone-is-destroying-your-memory/articleshow/19412724.cms

Jacob, S.M. and Isaac, B. (2008). The mobile devices and its mobile learning usage analysis. Proceedings of the International Multi-conference of Engineers and Computer Scientists, Hong Kong, Vol. 1, March, 19-21, pp. 782-87.

King, R. (2012). Mobile devices have positive impact on education, survey says. Retrieved from http://www.zdnet.com/blog/btl/mobile-devices-have-positive-impact-on-education-survey-says/68028

Knight, S. (2012, September 26). Retrieved March 17, 2014, from http://www.techspot.com/news/50310-smartphones-cure-boredom-but-is-that-necessarily-a-good-thing.html

Lee, C.-s. (2012). Smartphone addiction: disease or obsession? Retrieved March 18, 2014, from Korea Times: http://www.koreatimes.co.kr/www/news/opinon/2012/11/298_117506.html

Md. Rashedul Islam, Md. Rofiqul Islam, Tahidul Arafhin Mazumder. (2010). Mobile Application and Its Global Impact. International Journal of Engineering & Technology, IJET-IJENS, Vol: 10, No: 06, http://www.ijens.org/107506-0909%20ijet-ijens.pdf

Mogg, T. (2012). “Smartphone sales exceed those of PCs for first time, Apple smashes record”. Digital Trend. Retrieved from http://www.digitaltrends.com/mobile/smartphone-sales-exceed-those-of-pcs-for-first-time-apple-smashes-record/

Nam, I. (2013, Jul 23). A rising addiction among youths: Smartphones. Wall Street Journal (Online). Retrieved March 18, 2014, from http://eserv.uum.edu.my/docview/1411097432?accountid=42599

Russian National Committee on Non Ionizing Radiation Protection, Sanitary Rules of the Ministry of Health (Russia): SanPin 2.1.8/2.2.4.1190-03 point 6.9.

Sarwar, M., & Soomro, T.R. (2013, March). Impact of smartphone’s on society. European Journal of Scientific Research, 98 (2), 216-226. Retrieved March 18, 2014, from http://www.europeanjournalofscientificresearch.com/

Smith, M. (2011). A Practical Analysis of Smartphone Security. Salvendy (Eds.): Human Interface, Part I, pp. 311–320.

Font, S. (2013). How smartphones narrow the achievement gap in education. Retrieved 23 March 2014, from http://mobileworldcapital.com/en/article/78

Teoh, L. (2011). Mobile Stats 2011: 91% Use Mobile Phone to Socialize. Retrieved 16 March 2014, from http://www.biztechday.com/mobile-stats-2011-91-use-mobile-phones-to-socialize/

WHO. (2011). Mobile Phone Use: A Growing Problem of Driver Distraction. Journal of WHO, pp. 1-50.

How Sleeping Hours Affect Students’ Studies

STATISTICAL TECHNIQUES FOR BEHAVIORAL SCIENCE I

HOW SLEEPING HOURS AFFECT STUDENTS STUDIES IN UTAR PERAK CAMPUS

FACULTY OF ARTS AND SOCIAL SCIENCE


Abstract

Sleep deprivation and poor sleep quality affect the study performance of students. The purpose of this statistical study is to determine whether the amount of sleeping hours affects the studies of students at UTAR Perak Campus. It is hypothesized that participants who have lower sleep deprivation and higher sleep quality will perform better in their studies than those who experience higher sleep deprivation and lower sleep quality.

Introduction

According to Gilbert and Weaver (2010), the human body requires not only the basic needs of air, water and food to function well but also sufficient sleep, as it is important for learning, memory consolidation, critical thinking and decision making. Sleep is essential for optimal academic functioning.

Sleep deprivation is now widely recognized as a significant public health issue, not only among students but among people of all ages and groups. Some show excessive sleepiness, which is related not to the quantity of sleep obtained but to its quality (Gilbert & Weaver, 2010).

Both sleep deprivation and poor sleep quality are prominent among students because they often have irregular sleep patterns due to the workload of their study schedule and club activities. This results in short sleep lengths on weekdays and later wake-up times on weekends (Gilbert & Weaver, 2010).

University psychologists recognize that student academic performance is negatively affected by poor sleep quality and/or sleep deprivation. Though depression is also one of the factors that affect the academic performance of students, sleep quality may be an even more significant factor than depression (Gilbert & Weaver, 2010).

The impact of sleepiness on mood has been found to be large, as students who fell asleep during class report higher negative mood states.

Research Questions

Will sleeping hours affect the academic performance of students of UTAR Perak Campus?

Researchers want to find out how different amounts of sleeping hours affect the studies of students.

What are the factors that affect the quantity of students’ sleeping hours?

Researchers are interested in finding the factors that affect both the quality and quantity of students’ sleeping hours, which in turn affect the students’ studies.

Will a student’s sleeping habit be influenced by friends and family?

Researchers are keen to know the extent to which friends and family affect a student’s sleeping habit.

How many hours of sleep do the male and female students need per day?

Researchers want to study the amount of sleep that female and male students require.

What are the differences in CGPA scores of both male and female students according to the amount of sleeping hours they have?

The researchers are keen to study the differences in CGPA score obtained by students of both genders according to the amount of sleeping hours they have.

Literature Review

Sleep is very important to a human being’s health. The consequences of sleep manifest in both health and performance. The relationships between sleep and performance have been studied in many different fields, including human science, medicine, psychology, education and business. Sleep-related variables, for instance sleep deficiency, sleep quality and sleep habits, have been shown to influence the performance of students (Lack, 1986; Mulgrew et al., 2007; National Sleep Foundation, 2008; Pilcher & Huffcutt, 1996; Rosekind et al., 2010). According to Weitzman et al. (1981), Delayed Sleep Phase Syndrome (DSPS) is characterized by three main features: long sleep latency on weekdays (normally falling asleep between 2 a.m. and 6 a.m.), normal sleep length on weekends (usually sleeping late and waking up late), and difficulty in staying asleep. This sleep problem is common among students around the world.

Results indicate that in the U.S., 11.5% of undergraduate students were found to have DSPS (Brown, Soper, & Buboltz, 2001). In addition, Australian studies found the prevalence of DSPS in students (17%) to be higher than in adults (6-7%) (Lack, 1986; Lack, Miller, & Turner, 1988). Studies related to DSPS have also been conducted in other countries such as Japan, Norway, and Taiwan (Hazama, Inoue, Kojima, Ueta, & Nakagome, 2008; Schrader, Bovim, & Sand, 1993; Yang, Wu, Hsieh, Liu, & Lu, 2003). Furthermore, in Lack’s (1986) study, the DSPS group experienced sleepiness on weekdays more often than the non-DSPS group. In addition, when course grades were examined, members of the DSPS group were found to perform at a lower level academically than the non-DSPS group. In a more recent study, Trockel et al. (2000) found that first-year college students with lower GPAs reported later bedtimes and later wake-up times on both weekdays and weekends.

The relationship between sleep and academic performance has also been reviewed in other studies. Curcio, Ferrara, and Gennaro (2006) reviewed approximately 103 studies related to sleep loss, learning capacity, and academic performance, with samples drawn from students at different universities. According to Curcio, Ferrara, and Gennaro (2006), sleep loss was negatively correlated with academic performance. Results indicate that sleep-deprived students performed poorly on learning capacity skills, for instance attention, memory, and problem-solving tasks, and that the lack of sleep therefore indirectly affected their academic performance. Sleep deprivation is a term meaning loss of sleep (Drummond & McKenna, 2009). Moreover, sleep loss resulted in daytime sleepiness that was also correlated with poor academic performance; studies showed a significant relationship between lower GPA and lack of sleep among college students. The Multiple Sleep Latency Test is an instrument for evaluating daytime sleepiness that has been used by previous researchers (Carskadon, Harvey, & Dement, 1981; Fallone, Acebo, Arnedt, Seifer, & Carskadon, 2001; Randazzo, Muehlbach, Schweitzer, & Walsh, 1998).

Another study was conducted to determine the sleep patterns of medical students sitting ongoing professional examinations at Shifa College of Medicine, Islamabad, and to find the relationship between the number of hours of sleep before an examination and academic performance in that examination. The majority of the students slept less on exam days, and the reason was found to be studying late at night before the paper. There are various reasons for decreased sleep among university and college students, including watching TV and using the internet. A study done in a Pakistani medical university indicated that 58.9 per cent of the students slept less than 8 hours a day, and that the most common causes of sleep deprivation were watching television and listening to music. In addition, stress is a major contributing factor to the inability of university and college students to sleep at night. Consuming caffeine, pain killers or other substances and smoking at night to stay awake is another trend seen among students. This greatly contributes to sleeplessness at night among students and affects their academic performance adversely (Oshodi OY, Aina OF, Onajole AT, Omvik S, Pallesen S, Bjorvatn B, Thayer J, Nordhus H. Qureshi AU, Ali AS, Hafeez A, Ahmed TM). Moreover, the study showed that students who achieved good grades (A, B) were those who slept for more than 7 hours, while the majority of those who failed the exam were those who slept less.

Similarly, a study done in the USA showed that students with struggling grades (C’s, D’s, F’s) slept significantly less than those who scored A and B grades (Wolfson AR, Carskadon MA). According to the study, students slept an average of only 4.74 hours before the exam, and females slept less (4.71 ± 1.82 hours) than males (4.77 ± 3.27 hours). This was similar to a cross-sectional study done in Sao Paulo, which showed that boys slept about 390 minutes. However, their academic performance was not affected by the disturbance in the sleep cycle.

Furthermore, another study was carried out with 103 participants from undergraduate classes at the University of Minnesota. This study separated unhealthy sleep habits into two categories, quality and quantity of sleep. The survey asked questions about sleep habits in terms of quality and quantity of sleep, which were measured separately in order to break up the term “unhealthy sleep habits” and to analyse the topic with a different method from past research. The researchers found that sleep quantity and academic performance are related: GPA was related to some of the sleep deprivation measures for the average week and to the average amount of sleep obtained in a night. This result has practical applications for college students. The researchers found that amount of sleep and academic success are positively correlated, although it cannot be concluded that sleeping better leads to better exam scores.

Methodology

Participants of the Study

There are 50 participants in this study. They are degree students from University Tunku Abdul Rahman (Kampar) from the faculty of arts & social science (FAS), faculty of business and finance (FBF), faculty of information communication and technology (FICT), institute of Chinese studies (ICS) and faculty of science (FSC). Their ages range from 20 to 24. There are 5 male and 5 female participants from each faculty.

Instruments

Our questionnaire consists of 15 closed-ended questions, which involve different levels of measurement: nominal, ordinal, interval and ratio scales. The demographic details comprised in the questionnaire are gender, age, course of study and CGPA.

‘Sleep measures’ consist of Total Sleep Time (TST), Sleep Onset Latency (SOL), Sleep Efficiency (SE) and Wake After Sleep Onset (WASO), as determined by Cole-Kripke (1992). Total Sleep Time (TST) is the duration of time actigraphically determined as “sleep” within a 24-hour period, including daytime and nighttime periods of sleep. Sleep Onset Latency (SOL) is the time between getting into bed and falling asleep, calculated as the time from the start of actigraphically determined “inactivity” to the first minute scored as sleep. Of the four measurements, we refer only to TST and SOL in our questionnaire (question 2 and question 3).

To measure the sleep quality of students, we decided to use the Adult Sleep–Wake Scale (ADSWS). It is a self-report pencil-and-paper measure of sleep quality consisting of five behavioral dimensions: Going to Bed, Falling Asleep, Maintaining Sleep, Reinitiating Sleep, and Returning to Wakefulness. The questionnaire covers the time taken to fall asleep at night (ranging from <10 minutes to >1 hour), the amount of sleep required in order to function well on the following day (ranging from <5 hours to 8 hours), the factors that affect the quality or quantity of students’ sleeping hours, and the perceived influence of insufficient sleep on academic performance.

Procedure

We were curious about how sleeping hours affect students’ studies, so we came up with our research questions. After that, we prepared our questionnaires and printed them for the participants. We randomly selected 5 male and 5 female students from each faculty. Our questionnaire also included informed consent for the participants. On average, each participant took about 10 to 15 minutes to complete the questionnaire. Once the participants had completed their questionnaires, we collected the data immediately.

Data Analysis

Figure 1. Amount of sleeping hours affecting students’ studies

                 CGPA
Sleeping hours   0.00-1.99   2.00-2.19   2.20-2.99   3.00-3.49   3.50-4.0
<5 hours         0           0           2           1           0
5-6 hours        0           3           3           0           1
6-7 hours        0           7           8           0           1
7-8 hours        2           3           16          3           0

Figure 1 shows the CGPA scores obtained by students of UTAR Kampar campus according to the amount of sleeping hours they have. The largest group, 16 of the 24 students who obtained seven to eight hours of sleep per day, scored in the CGPA range of 2.20 to 2.99, and none of the students in this group reached the highest range of 3.50 to 4.0.
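The counts in Figure 1 can be checked with a short script. The following is a minimal sketch, not part of the original analysis: it copies the counts from Figure 1, and the helper names `row_shares` and `modal_range` are hypothetical, introduced only for illustration.

```python
# Counts copied from Figure 1: one row per sleeping-hours group,
# one column per CGPA range (in the order listed below).
cgpa_ranges = ["0.00-1.99", "2.00-2.19", "2.20-2.99", "3.00-3.49", "3.50-4.0"]
counts = {
    "<5 hours": [0, 0, 2, 1, 0],
    "5-6 hours": [0, 3, 3, 0, 1],
    "6-7 hours": [0, 7, 8, 0, 1],
    "7-8 hours": [2, 3, 16, 3, 0],
}


def row_shares(row):
    """Return each CGPA range's share of the students in one sleep group."""
    total = sum(row)
    return [count / total for count in row]


def modal_range(row):
    """Return the CGPA range with the most students (lowest range on ties)."""
    return cgpa_ranges[row.index(max(row))]


for hours, row in counts.items():
    shares = ", ".join(f"{share:.0%}" for share in row_shares(row))
    print(f"{hours:<10} n={sum(row):>2}  modal range: {modal_range(row)}  shares: {shares}")

print("Total participants:", sum(sum(row) for row in counts.values()))
```

Row shares make the groups comparable even though they differ greatly in size (3 students slept less than five hours, 24 slept seven to eight hours).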

Figure 2. Factors affecting the quality and quantity of students’ sleeping hours

Factors            % UTAR Kampar students (Total 50 students)
Night owl          18
Homework           34
Friends            14
Co-curriculum      6
Time management    14

Figure 2 shows the factors that affect the quality and quantity of students’ sleeping hours. 34% of the total of 50 students chose homework as the biggest factor that affects their sleeping hours, whereas only 6% of them chose co-curriculum as the factor that affects their sleeping hours. Other factors include being night owls, socializing with friends and time management.

Figure 3. Whether a student’s sleeping habit is influenced by friends and family

Sleeping habit influenced by friends and family   Male   Female
Yes                                               11     13
No                                                4      7

Figure 3 shows that 11 male and 13 female students stated that their sleeping hours were influenced by friends and family, while 4 male and 7 female students stated that their sleeping hours were not influenced by friends and family.

Figure 4. Amount of sleeping hours required by male and female students

Sleeping hours   Male   Female
<5 hours         2      1
5-6 hours        4      3
6-7 hours        6      10
7-8 hours        13     11

Figure 4 shows that 13 male and 11 female students stated that they require seven to eight hours of sleep per day, while only 2 male students and 1 female student require less than five hours of sleep per day.

Figure 5. CGPA score obtained by both male and female students according to the amount of sleeping hours per day

Male
                 CGPA score range
Sleeping hours   0.00-1.99   2.00-2.19   2.20-2.99   3.00-3.49   3.50-4.0
<5 hours         0           0           2           0           0
5-6 hours        0           1           3           0           0
6-7 hours        0           3           2           0           1
7-8 hours        2           1           10          0           0

Female
                 CGPA score range
Sleeping hours   0.00-1.99   2.00-2.19   2.20-2.99   3.00-3.49   3.50-4.0
<5 hours         0           0           0           1           0
5-6 hours        0           2           0           0           1
6-7 hours        0           4           6           0           0
7-8 hours        0           2           6           3           0

Figure 5 shows that 10 male and 6 female students who had seven to eight hours of sleep per day scored a CGPA in the range of 2.20-2.99, while 2 male students who had the same amount of sleep scored in the lowest CGPA range of 0.00-1.99. The 2 male students who had less than five hours of sleep had a CGPA in the range of 2.20-2.99, and the 1 female student who had the same amount of sleep had a CGPA in the range of 3.00-3.49.

History of statistics and its significance

Statistics is a relatively new subject, which branched from Probability Theory and is widely used in areas such as Economics and Astronomy. It is a logic and methodology for measuring uncertainty and for drawing inferences about that uncertainty (Stigler, 1986). The history of Statistics can be traced back to the 1600s. John Graunt (1620-1674) could be considered the pioneer of statistics and the author of the first book on the subject. In 1662 he published Natural and Political Observations on the Bills of Mortality, in which he studied the plague outbreak in London at the request of the King. Graunt was asked to come up with a system that would detect threats of further outbreaks by keeping records of mortality and causes of death and by estimating the population. By forming the life table, Graunt discovered that ‘statistically’ the ratio of males to females is almost equal. Then in 1666, he collected data and started to examine life expectancies. All of this was fundamental, as he was arguably the first to create a condensed life table from large data and to analyse it. Life tables are widely used in life insurance today, which shows the importance and significance of Graunt’s work (Verduin, 2009). His work is also significant because it demonstrated the value of data collection (Stigler, 1986). Then in 1693, Edmond Halley extended Graunt’s ideas and formed the first mortality table that statistically related age to death rates. Again, this is used in life insurance (Verduin, 2009).

Another contributor to the formation of statistics is Abraham De Moivre (1667-1754). He was the first person to identify the properties of the normal curve and in 1711 introduced the notion of statistical independence (Verduin, 2009). In 1724, De Moivre studied mortality statistics and laid down the foundations of the theory of annuities, inspired by the work of Halley. This is significant as annuities are widely used in the finance industry today, in particular when forming actuarial tables in life insurance. De Moivre then went on to discuss the idea of the normal distribution, which can be used to approximate the binomial distribution (O’Connor and Robertson, 2004).

William Playfair (1759-1823) invented statistical graphics, including the line graph and the bar chart in 1786 and the pie chart in 1801. He believed that charts were a better way to represent data and he was “driven to this invention by a lack of data”. This was a milestone, as these graphical representations are used everywhere today, the most notable being the time-series graph, which shows many data points measured at successive uniform intervals over a period of time. These graphs can be used to examine data such as share prices, and could be used to predict future data (Beniger and Robyn, 1978).

Adolphe Quetelet (1796-1874) was the first person to apply probability and statistics to the Social Sciences, in 1835. He was interested in studying human characteristics and suggested that the law of errors, which is commonly used in Astronomy, could be applied when studying people, so that assumptions or predictions could be made regarding the physical and intellectual features of a person. Through his studies, Quetelet discovered that when he plotted the distribution of certain characteristics, it had the shape of a bell curve. This was a significant discovery, as Quetelet later went on to form properties of the normal distribution curve, which is a vital concept in Statistics today. Using the concept of the “average man”, Quetelet examined other social issues such as crime rates and marriage rates. He is also well known for coming up with a formula called the Quetelet Index, more commonly known as the Body Mass Index, which is a measure of obesity. This is still used today: the BMI is calculated by dividing a person’s weight in kilograms by the square of their height in metres. An index of more than 30 means the person is classed as obese (O’Connor and Robertson, 2006).
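As an illustration of the Quetelet Index, the following is a minimal sketch using the standard definition (weight in kilograms divided by the square of height in metres) and the threshold of 30 mentioned in the text. The function names and the example person are hypothetical.

```python
def body_mass_index(weight_kg, height_m):
    """Quetelet Index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi, threshold=30.0):
    """Apply the 'more than 30' rule quoted in the text."""
    return bmi > threshold


# Hypothetical example: a person weighing 81 kg who is 1.80 m tall.
bmi = body_mass_index(81, 1.80)
print(f"BMI = {bmi:.1f}, obese: {is_obese(bmi)}")
```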

Others who made smaller but significant contributions to Statistics are Carl Gauss and Florence Nightingale. Gauss was the first person to experiment with the least squares estimation method, when he was interested in astronomy and attempted to predict the position of a planet. He later justified this method by assuming that the errors are normally distributed. The method of least squares is widely used today, in Astronomy for example, in order to minimise the error and improve the accuracy of results or calculations (O’Connor and Robertson, 1996). It was also the most commonly used method before 1827 for combining inconsistent equations (Stigler, 1986). Nightingale was inspired by Quetelet’s work on statistical graphics and produced a chart detailing the deaths of soldiers where she worked. She later went on to analyse the state and care of medical facilities in India. This was significant, as Nightingale applied statistics to health problems and this led to the improvement of medical healthcare. Her important works were recognised when she became the first female member of the Royal Statistical Society (Cohen, 1984).

One of the greatest contributors was Francis Galton (1822-1911), who helped create a statistical revolution that laid foundations for future statisticians like Karl Pearson and Charles Spearman (Stigler, 1986). He was related to Charles Darwin and had many interests, such as Eugenics and Anthropology. He came up with a number of vital concepts, including regression, standard deviation and correlation, which came about when Galton was studying sweet peas. He discovered that successive generations of sweet peas were of different sizes but regressed towards the mean size and the distribution of their parents (Gavan Tredoux, 2007). He later went on to work with the idea of correlation when he was studying the heights of parents and of their children on reaching adulthood; he made a diagram of his findings and found an obvious correlation between the two. He then performed a few other experiments and came to the conclusion that the index of correlation indicated the degree to which the two variables were related to one another. His studies were significant as they are all fundamental in Statistics today, and these methods are used in many areas of data analysis, especially for extracting meaningful information about the relationships between different factors (O’Connor and Robertson, 2003).

Stigler, Stephen M. The History of Statistics: The Measurement of Uncertainty before 1900. Belknap Press of Harvard University Press, March 1, 1990. p1, 4, 40, 266.

Verduin, Kees. A short History of Probability and Statistics. http://www.leidenuniv.nl/fsw/verduin/stathist/stathist.htm. Last Updated: March 2009. Last Accessed: 02/04/2010.

O’Connor, J J and Robertson, E F. The MacTutor History of Mathematics archive. http://www-history.mcs.st-and.ac.uk/Biographies/De_Moivre.html. Copyright June 2004. Last Accessed: 05/04/2010.

Beniger, James R. and Robyn, Dorothy L. Quantitative graphics in statistics: A brief history. The American Statistician, Volume: 32, No: 1, p1-11.

O’Connor, J J and Robertson, E F. The MacTutor History of Mathematics archive. http://www-groups.dcs.st-andrews.ac.uk/~history/Biographies/Quetelet.html. Copyright August 2006. Last Accessed: 06/04/2010.

O’Connor, J J and Robertson, E F. The MacTutor History of Mathematics archive. http://www-history.mcs.st-and.ac.uk/Biographies/Gauss.html. Copyright December 1996. Last Accessed: 06/04/2010.

Cohen, I. Bernard. Florence Nightingale. Scientific American 250, March 1984, p128-37/p98-107 depending on country of sale.

Tredoux, Gavan (edited and maintained by). Francis Galton. http://galton.org/. Last Updated: 12/11/07 (according to the update in ‘News’ section). Last Accessed: 07/04/2010.

O’Connor, J J and Robertson, E F. The MacTutor History of Mathematics archive. http://www-history.mcs.st-and.ac.uk/Biographies/Galton.html. Copyright October 2003. Last Accessed: 07/04/2010.

Statistics Essays – Histogram

We use Excel to generate a box plot representing both the original and the corrected sets of data.

[Diagram: box plot of the original and the corrected sets of data]

The different methods of diagrammatic representation of statistical data are the bar chart, the histogram, the stem and leaf diagram, and the lineplot. The bar chart is more appropriate for data from a discrete distribution that are summarised using a frequency distribution. A histogram is often used for representing data from a continuous variable which are summarised as a grouped frequency distribution. A histogram is therefore similar to a bar chart, but is used to present continuous data. The stem and leaf diagram gives a visual representation similar to the histogram but has the advantage that it does not lose the detail of the individual data points in the grouping. All these diagrams serve to examine the general shape of the distribution of the data and help in making conjectures about the values of quantities such as the median, the mean or the interquartile range. The last one, the lineplot, is often appropriate for smaller data sets, and can be useful, for example, to check whether two data sets have a common variance.

We denote by and the mean of the original set and the corrected set respectively. Then we have:

i.e. .

i.e. .

Since we have an even number of observations, the median in this case will be the midpoint of the two middle observations. That is:

For the original set the median is ;
For the corrected set the median is .

The standard deviation of each data set is given by , where , are the different values in each data set. Hence:

For the original set, , and for the corrected set .

The lower quartile is the observation one quarter of the way through the ordered data counting from below, and the upper quartile is the same but counting from above. The interquartile range is simply the difference between the upper and the lower quartile. We have the results in the following table.

                      Original set   Corrected set
Lower quartile        3.3925         3.3925
Upper quartile        3.815          3.7475
Interquartile range   0.4225         0.355
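The summary statistics discussed above can be reproduced with Python's standard `statistics` module. This is a minimal sketch under stated assumptions: the two data sets below are hypothetical (the original observations are not given in the text), the standard deviation uses the sample (n - 1) form, and quartiles use the "exclusive" method; the source may have used different conventions.

```python
import statistics


def summarize(data):
    """Return mean, median, sample standard deviation and quartiles of a data set."""
    ordered = sorted(data)
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),  # midpoint of the two middle values when n is even
        "stdev": statistics.stdev(ordered),    # sample standard deviation (n - 1 denominator)
        "lower_quartile": q1,
        "upper_quartile": q3,
        "iqr": q3 - q1,
    }


# Hypothetical data: the "corrected" set fixes one mistyped observation.
original = [3.2, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.9]
corrected = [3.2, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0]

for name, data in (("original", original), ("corrected", corrected)):
    print(name, {key: round(value, 4) for key, value in summarize(data).items()})
```

With these hypothetical values, correcting a single outlying observation changes the mean and standard deviation but leaves the median unchanged, which is the kind of comparison the two columns of the table support.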

Question 2

Theoretically, the fact that 9 and 12 can be made up in as many ways as 10 and 11 means that both sets of numbers should have the same probability of appearing. The first thing that should be noted here is that this is true if and only if, when we throw a dice, all the numbers have the same probability of appearing. This is not always the case in practice, where we need to allow for considerations such as the non-uniformity of the surface on which the dice is thrown, the angle and the velocity at which it is thrown, and even any deformation of the dice, all of which have an effect on the number that we will get. This problem thus highlights the impossibility of probability being an absolutely precise science, as opposed to the other branches of mathematics.
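One way to check a counting argument of this kind is to enumerate the outcomes. The sketch below assumes three fair six-sided dice; this is an assumption, since the text does not state the number of dice. It counts both the unordered combinations of faces and the ordered outcomes that give each total.

```python
from collections import Counter
from itertools import combinations_with_replacement, product


def count_ways(num_dice=3, sides=6):
    """Count, for each total, the unordered combinations and the ordered outcomes."""
    faces = range(1, sides + 1)
    unordered = Counter(sum(combo) for combo in combinations_with_replacement(faces, num_dice))
    ordered = Counter(sum(roll) for roll in product(faces, repeat=num_dice))
    return unordered, ordered


unordered, ordered = count_ways()
total_outcomes = 6 ** 3
for total in (9, 10, 11, 12):
    print(f"total {total:>2}: {unordered[total]} combinations, "
          f"{ordered[total]} ordered outcomes, "
          f"probability {ordered[total] / total_outcomes:.4f}")
```

Under this assumption, the four totals have the same number of unordered combinations but not the same number of ordered outcomes, so the counting convention matters when "ways" are turned into probabilities for fair dice.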

Question 3

The probability that a film processed on machine X is of good quality is . Also, the quality of a film is independent of the quality of all the films processed before it. Thus the probability that three films randomly chosen from a batch coming from machine X are all of good quality is simply .
Let us denote by the event “the batch came from machine X”, and by the event “the three films are all of good quality”. Clearly, what we are asking for is the probability that and occur at the same time, which is the probability that the three films are all of good quality and the batch came from machine X. Using the theory of conditional probabilities, we have:
.

Since all of all films are processed on machine X, then . is simply the probability that we calculated above. Thus . Hence:

.
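The structure of this argument can be written out symbolically. In the sketch below the symbol names are assumptions, since the original notation is not preserved: $A$ stands for the event "the batch came from machine X", $B$ for the event "the three films are all of good quality", and $g$ for the probability that a single film processed on machine X is of good quality. The first identity is the product rule for conditional probability, and the second follows from the stated independence between films.

```latex
P(A \cap B) = P(B \mid A)\, P(A),
\qquad
P(B \mid A) = g \cdot g \cdot g = g^{3}
```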

Question 4

At each question only two things can happen:
1-the student can answer the question correctly, and we denote by $p$ the probability that this happens;
2-or the student can choose a wrong outcome among the five possible, and we denote by $q$ the probability that this happens.

Obviously we must have $p + q = 1$. Given that only five outcomes are available at each question, only one of which is correct, we have $p = 1/5$ and $q = 4/5$.

The experiment that consists of answering a single question can therefore be viewed as a Bernoulli experiment with parameter $p$. Hence, taking the whole multiple-choice examination can be viewed as a Binomial experiment with parameter , where . Let be the random variable representing the number of correct answers achieved by the student. Clearly, the distribution of is Binomial with parameter . The probability that the student passes the test is , which is equivalent to . But:

,

where for each , .

Hence,
.

This gives us , and thus the probability that the student passes the test is .
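
The binomial calculation for this question can be sketched in a few lines. The number of questions and the pass mark did not survive in the text above, so they are left as parameters, and the values in the usage line are purely hypothetical. The only figure taken from the question is the chance of 1/5 of picking the correct outcome among five; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pass_probability(n_questions, pass_mark, p=Fraction(1, 5)):
    """P(X >= pass_mark): chance of at least pass_mark correct answers."""
    return sum(
        binomial_pmf(k, n_questions, p)
        for k in range(pass_mark, n_questions + 1)
    )


if __name__ == "__main__":
    # Hypothetical values: 10 questions, 5 correct answers needed to pass.
    print(float(pass_probability(n_questions=10, pass_mark=5)))
```

Using `Fraction` keeps the arithmetic exact; converting to `float` only at the end avoids accumulating rounding error in the sum.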

Question 5

Bayes’ Formula
Let E, F be subsets of some sample space S, and let Fc be the complement of F in S. We can express E as
E = EF ∪ EFc,
because in order for a point to be in E it must be either in E and F or in E but not in F. As EF and EFc are mutually exclusive we can write
P(E) = P(EF) + P(EFc) = P(E|F)P(F) + P(E|Fc)P(Fc).
Applying this to the conditional probability equation gives
P(F|E) = P(EF) / P(E) = P(E|F)P(F) / [P(E|F)P(F) + P(E|Fc)P(Fc)].
Consider the following problem:

We have three boxes labelled U1, U2 and U3. Each of them contains a mix of white and red balls. The proportion of white balls in each of them is as follows: 30% for U1, 60% for U2, 40% for U3.
We draw one ball from U1; if it is a white ball then we draw a ball from U2, otherwise we draw a ball from U3.
We would like to find the probability that the first draw gives a red ball knowing that the second draw has given a white ball.

We denote by Ai the event “the second draw is made in the box Ui” and by B the event “the second draw gives a white ball”.

Clearly, if the first draw gives a red ball, then the second can be made only in U3. Thus the probability that the first draw gives a red ball knowing that the second draw has given a white one is exactly the same as the probability that the second ball comes from U3 knowing that it is a white ball, which is nothing other than P(A3|B). Using Bayes’ formula, we have

P(A3|B) = P(A3 ∩ B) / P(B). (1)

It can easily be seen that A2 and A3 are mutually exclusive, as the second draw cannot happen in both U2 and U3 simultaneously. Also, since the second draw can happen only in U2 or U3, A2 ∪ A3 covers every possibility for where the second draw can happen. That is why
P(B) = P(B|A2)P(A2) + P(B|A3)P(A3) = 0.6 × 0.3 + 0.4 × 0.7 = 0.46.

The top of the fraction (1) is simply an application of the conditional probability: P(A3 ∩ B) = P(B|A3)P(A3) = 0.4 × 0.7 = 0.28.

Hence:

P(A3|B) = 0.28 / 0.46 = 14/23 ≈ 0.61.
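
As a numerical check of the box problem, here is a minimal sketch that applies Bayes' formula to the proportions stated in the problem (30%, 60% and 40% white balls in U1, U2 and U3). The variable and function names are illustrative and not part of the original solution.

```python
from fractions import Fraction

# Proportions of white balls, as stated in the problem.
P_WHITE = {"U1": Fraction(3, 10), "U2": Fraction(6, 10), "U3": Fraction(4, 10)}


def prob_first_red_given_second_white(p_white=P_WHITE):
    """Bayes' formula for P(first draw red | second draw white)."""
    # The second draw is made in U2 after a white first ball, in U3 after a red one.
    p_second_in_u2 = p_white["U1"]
    p_second_in_u3 = 1 - p_white["U1"]
    # Joint probability: first ball red AND second ball white.
    numerator = p_white["U3"] * p_second_in_u3
    # Total probability that the second ball is white.
    denominator = p_white["U2"] * p_second_in_u2 + numerator
    return numerator / denominator


if __name__ == "__main__":
    result = prob_first_red_given_second_white()
    print(result, float(result))
```

With the stated proportions the function returns 14/23, roughly 0.61, so a white second ball makes a red first ball somewhat less likely than its prior probability of 0.7.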

Gambling Addiction Literature Review

Literature Review

Chapter 2: Literature Review

2.1 Introduction

This chapter reviews past literature pertaining to the topic under study. As an opening, it brings to the fore the fundamentals of gambling. Several definitions of gambling and the rationale behind it are put forward, as described by several authors. Following this, the different types of gambling activities adopted by university students are highlighted, for example poker, sports wagering and lotteries. Furthermore, gamblers’ responses towards gambling activities and their problems are reviewed and contrasted.

2.2 What is gambling?

Gambling is the wagering of money or something of material value (stakes) on an event with an uncertain outcome with the primary intent of winning additional money or goods. Three key elements in gambling are: Consideration, Chance and Prize (I. N. Rose, 2013).

A McGill University review refers to gambling as any game or activity in which you may risk money or a valuable object in order to win money.

The elements present in gambling are, firstly, that one needs to realize that by gambling something valuable is being put at risk; secondly, that the outcome of the game is determined by chance; and finally, that once a bet is made it is irreversible.

2.3 History of gambling:

Gambling is one of mankind’s oldest activities, as indicated by writings and equipment found in tombs and other places. The origin of gambling is considered to be divinatory: by casting marked sticks and other objects and interpreting the outcome, man sought knowledge of the future and of the intentions of the gods.

Anthropologists have also pointed to the fact that gambling is more prevalent in societies where there is a widespread belief in gods and spirits whose benevolence may be sought. With the advent of legal gambling houses in the 17th century, mathematicians began to take a serious interest in games with randomizing equipment, such as dice and cards, out of which grew the field of probability theory.

Organised, sanctioned sports betting dates back to the late 18th century, when there was a shift in the official stance towards gambling: from considering it a sin, to considering it a vice and a human weakness, and lastly to seeing it as a mostly harmless and even entertaining activity.

By the start of the 21st century approximately four out of five people in western nations gambled at least every week.

2.4 Who is a gambler?

A person who wagers money on the outcome of games or sporting events can be categorized as a gambler. Gamblers can visit gambling houses, or use any other facility, to place their bets and hope for a win. There are three common types of gambler: the social gambler, the professional gambler and the problem gambler. Professional gamblers are the rarest type; they rely less on luck than on games of skill to make a living, and they have full control over the money, time and energy they spend on the game. Social gamblers consider gambling to be a recreational activity and maintain control over their betting and over the energy and time they spend on the game. They consider their betting to be a price paid for entertainment. Problem gambling involves continued involvement in gambling despite negative consequences, and this can lead to other health and social problems.

2.5 Gambling across the globe

2.5.1 Gambling age

The gambling age across the globe varies greatly. In some countries and areas gambling is prohibited altogether; in others gambling is only authorized for foreigners. In some areas, everyone is allowed to play but the betting age requirement is not the same for citizens as for foreigners. An example of such a country is Portugal, where foreigners are allowed to enter all casinos at the age of 18, while citizens need to be 21 or 25 depending on the gaming house.

The most common gambling age across the globe is 18 years, and more than 50% of western countries have this gambling age. There are nonetheless plenty of examples of countries that have a higher limit, such as Greece and Germany. Germany is a good illustration of how thorny the question of gambling age really is, as Germany, just like the USA, has different ages in different states within the nation. Most German states require you to be 18 years old, but some have placed the age limit at 21 years instead.

Generally speaking, one can see a trend of countries and states lowering the gambling age from the once-dominant 21 years to just 18 years. This trend has been going on for quite some time and across large parts of the world.

2.5.2 Top of the world

Certain countries are, as a whole, keen on gambling. Measured in terms of loss per adult, the two top nations stand head and shoulders above the rest of the world. Those two infamous gambling meccas are Australia and Singapore (American Gaming Association, 2006).

The top five countries in terms of gambling losses per capita of the adult population are Australia, Singapore, Ireland, Canada and Finland. The average net yearly expenditure on gambling per adult for these nations runs from $1,275 down to $540 (American Gaming Association, 2006).

2.6 Gambling in Mauritius:

It was recently announced that the Council of Ministers in Mauritius endorsed the resolution that bookmakers operating out of the Champ de Mars racecourse are now permitted to work only on Fridays and Saturdays. Until now they were allowed to take bets upon publication of the official program of races on Thursdays. The rationale given for this decision is that it will help to reduce the influence of gambling on Mauritians.

Gambling has become part of the fabric of Mauritian society over the years. This includes casino gambling, online gambling, horse race betting and the “loterie verte”. Although horse racing is still a popular betting sport, the Lotto, since its introduction on the 7th of November 2009 as the new national lottery, has surpassed it in popularity. We just have to listen to the radio for a few minutes or take a glimpse at the billboards when driving on the public road to learn about the jackpot for the coming draw. There are more than 500 counters across the island in supermarkets, petrol stations and shops enabling customers to play the Lotto. Around 12 scratch cards have also been introduced, giving people the prospect of winning instant money. Where people used to place their hard-earned money primarily on horses, now they are being lured into wasting it on the Lotto. A considerable number of people have already been caught by the “jackpot fever”, spending more than usual when the jackpot gets bigger.

2.7 Types of gambling:

Gambling is a vast world which comprises many branches, in which people try their luck in the hope of making more money or just for the thrill of the game. In Mauritius you can easily find casinos, gaming houses (which are smaller than casinos but offer the same service for middle-class players) and shops where you can gamble. Some of the forms of gambling available on the island are:

2.7.1 The lottery. The ‘loterie verte’ and the Lotto are the most common types of gambling in Mauritius and the most profitable for the government. The ‘loterie verte’ is a monthly lottery: you buy tickets from a retailer, which can be found everywhere, and wait for the end of the month to check the results and see if you have won. Tickets cost Rs10 each and you are eligible to win prizes ranging from Rs100 000 to Rs10 million. The Lotto, on the other hand, established itself in Mauritius more recently and is now the new craze for Mauritians. The idea is that you select 6 numbers out of 40 (each number can be selected only once) and then go to any supermarket or retailer to validate your 6 numbers. Each ticket costs Rs20 and you can play as many tickets as you want. Lottotech, the company which runs the Lotto, makes a public draw, on air, on the national channel every Saturday. The Lotto has a cumulative jackpot: if no one wins the jackpot one week, it is added to the following week’s jackpot, so there is a chance of winning a bigger one each time it goes unwon. The jackpot starts at Rs5 million and can go up to Rs70 million (the biggest jackpot won so far).
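
To give a sense of the odds implied by choosing 6 numbers out of 40, the following sketch counts the possible tickets. It assumes that the draw also consists of 6 numbers from the same 40, that order does not matter, and that the jackpot requires matching all six; the text does not describe the prize rules, so these are assumptions, and the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def match_probability(k, pool=40, picks=6):
    """Probability that exactly k of a ticket's numbers are among those drawn."""
    # Choose which k of the drawn numbers are on the ticket, then fill the
    # remaining picks - k slots from the pool - picks numbers not drawn.
    favourable = comb(picks, k) * comb(pool - picks, picks - k)
    return Fraction(favourable, comb(pool, picks))


if __name__ == "__main__":
    print("possible tickets:", comb(40, 6))
    print("P(all six match):", float(match_probability(6)))
```

Under these assumptions there are 3,838,380 distinct tickets, so a single Rs20 ticket has a chance of 1 in 3,838,380 of matching all six numbers.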

2.7.2 Horse racing. Horse racing has been anchored in our society for ages and nowadays forms part of our cultural and historical heritage. It was introduced in Mauritius by the English before independence and it is still going strong. In the beginning horse racing was more about fame and social status than about making money and gambling. Later, to make the horse industry run and thrive, the board introduced betting on horse racing, and this was also a good opportunity for the government to collect tax money. Horse racing is a huge event in Mauritius: every Saturday, and on some special occasions on Sundays, there is horse racing at the Champ de Mars, the racetrack in the capital, Port Louis. Nowadays in every rural and urban area you can find bookmakers who will take your bets on the horses from Friday, and on racing days a huge crowd converges on the Champ de Mars for the fun and in the hope of making money.

2.7.3 Casino. A casino is a facility which accommodates certain types of gambling activities, such as slot machines, poker, blackjack, big or small, van lak, dice and roulette. Casinos are situated in strategic areas to lure more and more clients; such areas might be near hotels or tourist attractions, or in a city or town frequented by many people. In Mauritius there are many casinos and gaming houses (smaller casinos that are still well frequented), found in urban areas such as Rose-Hill, Vacoas and Port-Louis and in some tourist places such as Grand Baie. Most games played have mathematically determined odds that ensure the house has at all times an overall advantage over the players. This can be expressed more precisely by the notion of expected value, which is uniformly negative (from the player’s perspective). This advantage is called the house edge. This is why there is an adage for casinos: “the house always wins”. In Mauritius nowadays we can witness more and more casinos being granted a patent and opening their doors to the public. The government knows that this is a lucrative market, and making gambling accessible to more tourists and residents is to its advantage, since casinos have to pay substantial taxes and fees to get their patent. Several tournaments are being organized in Mauritius, such as the World Poker Tour National Mauritius, which draws people from all over Africa and the Indian Ocean to come to Mauritius just to play poker. When hotels advertise the island they now also advertise casinos to attract more tourists and a new clientele. This strategy sets them apart from other hotels, as they increasingly target high-class ‘gambling tourists’, which is a very profitable market.
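
The notions of expected value and house edge mentioned in this section can be illustrated with a short calculation. The example bet is an assumption chosen for illustration and is not taken from the text: a one-unit bet on a single number at a single-zero roulette wheel with 37 pockets, paying 35 to 1. The function name is likewise illustrative.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Expected net gain per bet, given (probability, net_payoff) pairs."""
    if sum(p for p, _ in outcomes) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(p * payoff for p, payoff in outcomes)


# Illustrative assumption: single-number bet of one unit, 37 pockets, pays 35 to 1.
single_number_bet = [
    (Fraction(1, 37), 35),   # win: net gain of 35 units
    (Fraction(36, 37), -1),  # lose: the one-unit stake is lost
]

if __name__ == "__main__":
    ev = expected_value(single_number_bet)
    house_edge = -ev  # player's expected loss per unit staked
    print(f"expected value per unit staked: {float(ev):.4f}")
    print(f"house edge: {float(house_edge):.2%}")
```

For this assumed bet the expected value is -1/37 per unit staked, a house edge of about 2.7%: a player wins on some spins, but the average result over many spins favours the house.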

2.7.4 Scratch cards. This is the new craze among the Mauritian people. Scratch cards are simple and easily available across the whole island. The rule is simple: buy a card and scratch the opaque surface which conceals the information; if you get the required symbols, you win. The most attractive features are the opportunity to win instantly, as compared to the lottery where you have to wait for the draw, the prices at which the cards are sold, and the prizes that you can win. Cards range from Rs20 to Rs100 and prizes vary from Rs200 000 to Rs1 million. The scratch cards are supervised by Lottotech, the same company which manages the Lotto in Mauritius.

2.7.5 Online gambling. Ease, availability and affordability are the words usually associated with online gambling. It is easy because on some betting sites you can log in without creating an account or paying fees. It is available because it is all over the internet; you do not have to look far to find online gambling sites. Banner ads and pop-ups can be found on almost every site which has a high level of traffic. It is affordable since some sites let you bet for free, and if you win you then have to cash in to be able to keep playing; some allow you to choose how much you want to bet and give you live odds according to what is happening, which cannot be found elsewhere. Online gambling most often targets teenagers; this is a strategy called ‘grooming’, whereby teenagers are made familiar with the attractiveness of the game so that when they become older they will still be players and a potential source of income.

2.8 Gambling among university students

Gambling is omnipresent among university students, as research has demonstrated. The vast majority of students gamble without experiencing ill effects, yet almost 8% of university students may develop a gambling problem (Derevensky & Gupta, 2007). Gambling was once an acceptable form of entertainment on campuses; under the new laws it is now forbidden to participate in any kind of gambling activity there, yet it can still be found everywhere. However, the warning signs of developing a gambling problem are not publicized in the way they are for other potentially addictive behaviors, such as drug use and alcohol consumption. With the growth in gambling venues, the social acceptance of gambling, and access to widespread and inexpensive means of gambling, it is not surprising that studies have found high rates of gambling-related problems among college students.

2.9 Problem gambling

Problem gambling, or ludomania, is an urge to gamble continuously despite harmful negative consequences or a desire to stop. The prevalence of problem gambling has been estimated at 7.8% among university students, which is considerably higher than the roughly 5% rate found among the general population (Blinn, Pike, Worthy, Jonkman, 2006). Students facing problem gambling display many signs, including isolating behavior, lowered academic performance, poor impulse control, extreme overconfidence, and participation in other high-risk behaviors such as alcohol, tobacco and marijuana use and risky sexual behavior (LaBrie et al., 2003; Goodie, 2005). Environmental factors also contribute to problem gambling. The surroundings of a student are a key factor in determining whether he is prone to problem gambling. If students live in an area where gambling opportunities are available and social normative beliefs are supportive of gambling activities, this increases the likelihood of gambling participation and of the development of a gambling problem. Staff who are conscious of the environmental conditions that may contribute to problem gambling can develop policies to help these students (Wehner, 2007).

2.9.1 Gambling Addiction and Problem Gambling

Whether you wager on scratch cards, sports, poker, roulette, or slots, in a casino or online, problem gambling can strain relationships, interfere with work, and lead to financial catastrophe. You may even do things you never thought you would, like stealing money to gamble or to pay your debts. You may believe you can’t stop but, with the right help, you can overcome a gambling problem or compulsion and regain control of your life. The first step is recognizing and acknowledging the problem. Gambling addiction is sometimes referred to as the “hidden illness” because there are no obvious physical signs or symptoms like there are in drug or alcohol addiction. Problem gamblers typically deny or minimize the problem. They also go to great lengths to hide their gambling habits. For example, problem gamblers often withdraw from their loved ones, sneak around, and lie about where they’ve been and what they’ve been up to (Jeanne Segal, Ph.D., Melinda Smith, M.A., and Lawrence Robinson, 2013).