Statistics and Machine Learning Toolbox

Estimate Category Probabilities for Nominal Responses

Estimate Upper and Lower Error Bounds for Probability Estimates of Ordinal Responses

Estimate Category Counts and Error Bounds for Nominal Responses

Multinomial logistic regression values

[pihat,dlow,dhi] = mnrval(B,X,stats)

[pihat,dlow,dhi] = mnrval(B,X,stats,Name,Value)

[yhat,dlow,dhi] = mnrval(B,X,ssize,stats)

[yhat,dlow,dhi] = mnrval(B,X,ssize,stats,Name,Value)

pihat = mnrval(B,X) returns the predicted probabilities for the multinomial logistic regression model with predictors X and coefficient estimates B.

pihat is an n-by-k matrix of predicted probabilities for each multinomial category. B is the vector or matrix that contains the coefficient estimates returned by mnrfit, and X is an n-by-p matrix that contains n observations for p predictors.

mnrval automatically includes a constant term in all models. Do not enter a column of 1s in X.

[pihat,dlow,dhi] = mnrval(B,X,stats) also returns 95% error bounds on the predicted probabilities, pihat, using the statistics in the structure stats returned by mnrfit.

The lower and upper confidence bounds for pihat are pihat minus dlow and pihat plus dhi, respectively. Confidence bounds are nonsimultaneous and apply only to the fitted curve, not to new observations.

[pihat,dlow,dhi] = mnrval(B,X,stats,Name,Value) returns the predicted probabilities and 95% error bounds on the predicted probabilities pihat, with additional options specified by one or more Name,Value pair arguments.

For example, you can specify the model type, link function, and the type of probabilities to return.

yhat = mnrval(B,X,ssize) returns the predicted category counts for the sample sizes in ssize.

[yhat,dlow,dhi] = mnrval(B,X,ssize,stats) also computes 95% error bounds on the predicted counts yhat, using the statistics in the structure stats returned by mnrfit.

The lower and upper confidence bounds for yhat are yhat minus dlow and yhat plus dhi, respectively. Confidence bounds are nonsimultaneous and apply only to the fitted curve, not to new observations.

[yhat,dlow,dhi] = mnrval(B,X,ssize,stats,Name,Value) returns the predicted category counts and 95% error bounds on the predicted counts yhat, with additional options specified by one or more Name,Value pair arguments.

For example, you can specify the model type, link function, and the type of predicted counts to return.

Fit a multinomial regression for nominal outcomes and estimate the category probabilities.

The column vector species consists of iris flowers of three different species: setosa, versicolor, and virginica. The double matrix meas consists of four types of measurements on the flowers: the length and width of sepals and petals, respectively, in centimeters.

Define the nominal response variable.

Now in sp, 1, 2, and 3 indicate the species setosa, versicolor, and virginica, respectively.

Fit a nominal model to estimate the species using the flower measurements as the predictor variables.

Estimate the probability of being a certain kind of species for an iris flower having the measurements (6.3, 2.8, 4.9, 1.7).

The probability of an iris flower having the measurements (6.3, 2.8, 4.9, 1.7) being a setosa is 0, a versicolor is 0.3977, and a virginica is 0.6023.
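The steps of this example can be sketched as follows. This is a minimal sketch, assuming the fisheriris sample data set that ships with the toolbox. The encoding of sp through double(categorical(species)) is an assumption chosen to produce the codes 1, 2, and 3 described above; the page's own code may define sp differently.

```matlab
% Load Fisher's iris data: species (150-by-1 cell array) and meas (150-by-4).
load fisheriris

% Encode the species as 1, 2, 3 (assumed encoding; the categories sort
% alphabetically as setosa, versicolor, virginica).
sp = double(categorical(species));

% Fit a nominal model, which is the default model type of mnrfit.
[B,dev,stats] = mnrfit(meas,sp);

% Predict the category probabilities for one flower.
x = [6.3, 2.8, 4.9, 1.7];
pihat = mnrval(B,x)
```

pihat has one column per species, and each row sums to 1.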

Fit a multinomial regression model for categorical responses with natural ordering among categories. Then estimate the upper and lower confidence bounds for the category probability estimates.

Load the sample data and define the predictor variables.

The predictor variables are the acceleration, engine displacement, horsepower, and the weight of the cars. The response variable is miles per gallon (MPG).

Create an ordinal response variable categorizing MPG into four levels from 9 to 48 mpg.

Now in miles, 1 indicates the cars with miles per gallon from 9 to 19, and 2 indicates the cars with miles per gallon from 20 to 29. Similarly, 3 and 4 indicate the cars with miles per gallon from 30 to 39 and 40 to 48, respectively.

Fit a multinomial regression model for the response variable miles. For an ordinal model, the default link is logit and the default interactions is off.

Compute the probability estimates and the 95% error bounds for the miles-per-gallon category of a car with x = (12, 113, 110, 2670).

Calculate the confidence bounds for the category probability estimates.

LL = pihat - dlow;
UL = pihat + hi;
[LL;UL]

ans =

    0.0073    0.7829    0.0283   -0.0003
    0.1157    0.9022    0.1580    0.0057
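The full sequence for this example, from loading the data to the bounds, might look like the sketch below. It assumes the carbig sample data set and uses ordinal with the bin edges 9, 19, 29, 39, and 48 to build miles. The edges and level labels are assumptions consistent with the four levels described above, not values confirmed by this page.

```matlab
% Load the car data and assemble the predictor matrix.
load carbig
X = [Acceleration Displacement Horsepower Weight];

% Bin MPG into four ordered levels (assumed edges), then convert to codes 1-4.
miles = ordinal(MPG,{'1','2','3','4'},[],[9,19,29,39,48]);
miles = double(miles);

% Fit an ordinal model (default link 'logit', default interactions 'off').
[B,dev,stats] = mnrfit(X,miles,'model','ordinal');

% Probability estimates and 95% error bounds for one car.
x = [12 113 110 2670];
[pihat,dlow,hi] = mnrval(B,x,stats,'model','ordinal');

% Lower and upper confidence limits, one column per category.
LL = pihat - dlow;
UL = pihat + hi;
[LL;UL]
```

Because the bounds come from a normal approximation, a lower limit can fall slightly below 0, as the last column of the output above does.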

Estimate Category Counts and Error Bounds for Nominal Responses

Fit a multinomial regression for nominal outcomes and estimate the category counts.

The column vector species consists of iris flowers of three different species: setosa, versicolor, and virginica. The double matrix meas consists of four types of measurements on the flowers: the length and width of sepals and petals, respectively, in centimeters.

Define the nominal response variable.

Now in sp, 1, 2, and 3 indicate the species setosa, versicolor, and virginica, respectively.

Fit a nominal model to estimate the species based on the flower measurements.

Estimate the number in each species category for a sample of 100 iris flowers all with the measurements (6.3, 2.8, 4.9, 1.7).

Estimate the error bounds for the counts.

Calculate the confidence bounds for the category count estimates.
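A sketch of the count example follows, assuming the same fisheriris data and the same assumed encoding of sp via double(categorical(species)). The scalar 100 is passed as ssize because a single observation is being predicted.

```matlab
% Load the data and encode the species as 1, 2, 3 (assumed encoding).
load fisheriris
sp = double(categorical(species));

% Fit a nominal model and keep the stats structure for the error bounds.
[B,dev,stats] = mnrfit(meas,sp);

% Predicted counts per species for 100 flowers with these measurements.
x = [6.3, 2.8, 4.9, 1.7];
yhat = mnrval(B,x,100);

% Counts with 95% error bounds.
[yhat,dlow,hi] = mnrval(B,x,100,stats,'model','nominal');

% Lower and upper confidence limits for the counts.
LL = yhat - dlow;
UL = yhat + hi;
[LL;UL]
```

The predicted counts are the predicted probabilities scaled by the sample size, so they sum to 100.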

Create sample data with one predictor variable and a categorical response variable with three categories.

There are observations on seven different values of the predictor variable x. The response variable Y has three categories, and the data show how many of the 25 individuals are in each category of Y for each observation of x. For example, when x is -3, 1 of the 25 individuals is observed in category 1, 11 are observed in category 2, and 13 are observed in category 3. Similarly, when x is 1, 5 of the individuals are observed in category 1, 14 are observed in category 2, and 6 are observed in category 3.

Plot the number in each category versus the x values on a stacked bar graph.

Fit a nominal model for the individual response category probabilities, with separate slopes on the single predictor variable, x, for each category.

The first row of the estimated coefficient matrix contains the intercept terms for the first two response categories. The second row contains the slopes. mnrfit uses the third category as the reference category and hence assumes the coefficients for the third category are zero.

Compute the predicted probabilities for the three response categories.

The probability of being in the third category is simply 1 - P(y = 1) - P(y = 2).

Plot the estimated cumulative number in each category on the bar graph.

The cumulative probability for the third category is always 1.

Now, fit a parallel ordinal model for the cumulative response category probabilities, with a common slope on the single predictor variable, x, across all categories.

The first two elements of betaHatOrd are the intercept terms for the first two response categories. The last element of betaHatOrd is the common slope.

Compute the predicted cumulative probabilities for the first two response categories. The cumulative probability for the third category is always 1.

Plot the estimated cumulative number on the bar graph of the observed cumulative number.
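The nominal and ordinal fits in this example can be sketched as below. Only the rows of Y for x = -3 and x = 1 are quoted in the text; the remaining rows are assumed values for illustration, chosen so that every row sums to 25. The names betaHatNom, pHatNom, pHatOrd, and xx are likewise hypothetical.

```matlab
% Predictor values and category counts (25 individuals per row).
% Rows for x = -3 and x = 1 come from the text; the others are assumed.
x = [-3 -2 -1 0 1 2 3]';
Y = [1 11 13; 2 9 14; 6 14 5; 5 10 10; 5 14 6; 7 13 5; 8 11 6];

% Stacked bar graph of the observed counts.
figure
bar(x,Y,'stacked');
ylim([0 25]);

% Nominal model with a separate slope for each category.
betaHatNom = mnrfit(x,Y,'model','nominal','interactions','on');

% Predicted probabilities on a fine grid, overlaid as cumulative counts.
xx = linspace(-4,4)';
pHatNom = mnrval(betaHatNom,xx,'model','nominal','interactions','on');
hold on
plot(xx,cumsum(25*pHatNom,2),'LineWidth',2);
hold off

% Parallel ordinal model with a common slope across categories.
betaHatOrd = mnrfit(x,Y,'model','ordinal','interactions','off');

% Predicted cumulative probabilities for the first two categories.
pHatOrd = mnrval(betaHatOrd,xx,'type','cumulative', ...
    'model','ordinal','interactions','off');

% Grouped bar graph of observed cumulative counts with the fitted curves.
figure
bar(x,cumsum(Y,2),'grouped');
ylim([0 25]);
hold on
plot(xx,25*pHatOrd,'LineWidth',2);
hold off
```

The nominal fit returns a 2-by-2 coefficient matrix (intercepts in the first row, slopes in the second), whereas the parallel ordinal fit returns a 3-element vector (two intercepts and one common slope).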

Coefficient estimates for the multinomial logistic regression model, specified as a vector or matrix returned by mnrfit. Whether it is a vector or a matrix depends on the model and interactions.

Example: B = mnrfit(X,y); pihat = mnrval(B,X)

Sample data on predictors, specified as an n-by-p matrix. X contains n observations for p predictors.

mnrval automatically includes a constant term in all models. Do not enter a column of 1s in X.

Model statistics, specified as a structure returned by mnrfit. You must use the stats input argument in mnrval to compute the lower and upper error bounds on the category probabilities and counts.

Example: [B,dev,stats] = mnrfit(X,y); [pihat,dlow,dhi] = mnrval(B,X,stats)

Sample sizes to return the number of items in response categories for each combination of the predictor variables, specified as an n-by-1 column vector of positive integers.

For example, for a response variable having three categories, if an observation of the number of individuals in each category is y1, y2, and y3, respectively, then the sample size, m, for that observation is m = y1 + y2 + y3.

If the sample sizes for n observations are in the vector sample, then you can enter the sample sizes as follows.
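A minimal sketch of passing sample sizes, assuming a small count data set in which each row of Y holds the category counts for one observation. The data values are illustrative assumptions, and sample is computed as the row sums of Y.

```matlab
% Illustrative data: one predictor, three response categories (assumed values).
x = [-3 -2 -1 0 1 2 3]';
Y = [1 11 13; 2 9 14; 6 14 5; 5 10 10; 5 14 6; 7 13 5; 8 11 6];

% Sample size of each observation: m = y1 + y2 + y3.
sample = sum(Y,2);

% Fit a nominal model to the counts.
B = mnrfit(x,Y);

% Predicted category counts for each observation's sample size.
yhat = mnrval(B,x,sample);
```

Each row of yhat sums to the corresponding element of sample.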

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,…,NameN,ValueN.

Example: 'model','ordinal','link','probit','type','cumulative' returns the estimates for cumulative probabilities for an ordinal model with a probit link function.
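In a complete call, these pairs might be used as in the sketch below. It assumes illustrative count data and fits the model with the same 'model' and 'link' settings, so that the coefficients passed to mnrval match the options given to it.

```matlab
% Illustrative data: one predictor, three ordered categories (assumed values).
x = [-3 -2 -1 0 1 2 3]';
Y = [1 11 13; 2 9 14; 6 14 5; 5 10 10; 5 14 6; 7 13 5; 8 11 6];

% Fit an ordinal model with a probit link.
B = mnrfit(x,Y,'model','ordinal','link','probit');

% Cumulative probabilities for the first k - 1 = 2 categories.
pcum = mnrval(B,x,'model','ordinal','link','probit','type','cumulative');
```

pcum is 7-by-2 here. The cumulative probability of the last category is always 1 and is not returned.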

Type of multinomial model fit by mnrfit, specified as the comma-separated pair consisting of 'model' and one of the following.

'nominal': Default. Specify when there is no ordering among the response categories.

'ordinal': Specify when there is a natural ordering among the response categories.

'hierarchical': Specify when the choice of response category is sequential.

Indicator for an interaction between the multinomial categories and coefficients in the model fit by mnrfit, specified as the comma-separated pair consisting of 'interactions' and one of the following.

'on': Default for nominal and hierarchical models. Specify to fit a model with different intercepts and coefficients across categories.

'off': Default for ordinal models. Specify to fit a model with different intercepts, but a common set of coefficients for the predictor variables, across all multinomial categories. This is often described as parallel regression.

Link function mnrfit uses for ordinal and hierarchical models, specified as the comma-separated pair consisting of 'link' and one of the following.

The link function defines the relationship between response probabilities and the linear combination of predictors, X.

The response probabilities might be cumulative or conditional probabilities, depending on whether the model is for an ordinal or a sequential/nested response.

You cannot specify the 'link' parameter for nominal models; these always use a multinomial logit link,

$$\ln\left(\frac{\pi_j}{\pi_r}\right) = \beta_{j0} + \beta_{j1}X_{j1} + \beta_{j2}X_{j2} + \cdots + \beta_{jp}X_{jp}, \quad j = 1, \ldots, k-1,$$

where π stands for a categorical probability, r corresponds to the reference category, k is the total number of response categories, and p is the number of predictor variables. mnrfit uses the last category as the reference category for nominal models.
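To make the multinomial logit relation concrete, the sketch below inverts it by hand for a single observation. The coefficient values are hypothetical. The layout of B assumes intercepts in the first row and one column for each non-reference category, as in the nominal example earlier on this page. The sketch uses only base MATLAB and illustrates the formula, not how mnrval is implemented.

```matlab
% Hypothetical coefficients for p = 1 predictor and k = 3 categories:
% row 1 holds the intercepts, row 2 the slopes; one column per category 1..k-1.
B = [0.5 -0.2;
     1.0  0.3];
x = 2;                        % one observation of the predictor

% Linear predictors eta_j = ln(pi_j / pi_r) for j = 1, ..., k-1.
eta = [1 x] * B;

% The reference (last) category absorbs the normalization.
piRef = 1 / (1 + sum(exp(eta)));

% Category probabilities: pi_j = exp(eta_j) * pi_r, followed by pi_r itself.
pihat = [exp(eta) * piRef, piRef];
```

Taking log(pihat(j)/pihat(end)) recovers eta(j), which is the multinomial logit link relative to the reference category.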

Type of probabilities or counts to estimate, specified as the comma-separated pair consisting of 'type' and one of the following.

'category': Default. Specify to return predictions and error bounds for the probabilities (or counts) of the k multinomial categories.

'cumulative': Specify to return predictions and confidence bounds for the cumulative probabilities (or counts) of the first k - 1 multinomial categories, as an n-by-(k - 1) matrix. The predicted cumulative probability for the kth category is always 1.

'conditional': Specify to return predictions and error bounds in terms of the first k - 1 conditional category probabilities (counts), i.e., the probability (count) for category j, given an outcome in category j or higher. When 'type' is 'conditional', and you supply the sample size argument ssize, the predicted counts at each row of X are conditioned on the corresponding element of ssize.

Confidence level for the error bounds, specified as the comma-separated pair consisting of 'confidence' and a scalar value in the range (0,1).

For example, for 99% error bounds, you can specify the confidence as follows:
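A minimal sketch of requesting 99% bounds, assuming illustrative count data. The default 95% bounds are computed alongside for comparison.

```matlab
% Illustrative data: one predictor, three response categories (assumed values).
x = [-3 -2 -1 0 1 2 3]';
Y = [1 11 13; 2 9 14; 6 14 5; 5 10 10; 5 14 6; 7 13 5; 8 11 6];

% Fit a nominal model and keep the stats structure.
[B,dev,stats] = mnrfit(x,Y);

% Default 95% error bounds.
[pihat95,dlow95,dhi95] = mnrval(B,x,stats);

% 99% error bounds via the 'confidence' name-value pair.
[pihat99,dlow99,dhi99] = mnrval(B,x,stats,'confidence',0.99);
```

The probability estimates are the same in both calls. Only the error bounds change, and the 99% bounds are at least as wide as the 95% bounds.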

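Probability estimates for each multinomial category, returned as an n-by-k matrix for the default 'category' type or an n-by-(k - 1) matrix for the 'cumulative' and 'conditional' types, where n is the number of observations and k is the number of response categories.

Count estimates for the number in each response category, returned as an n-by-k matrix for the default 'category' type or an n-by-(k - 1) matrix for the 'cumulative' and 'conditional' types, where n is the number of observations and k is the number of response categories.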

Lower error bound to compute the lower confidence bound for pihat or yhat, returned as a column vector.

The lower confidence bound for pihat is pihat minus dlow. Similarly, the lower confidence bound for yhat is yhat minus dlow. Confidence bounds are nonsimultaneous and apply only to the fitted curve, not to new observations.

Upper error bound to compute the upper confidence bound for pihat or yhat, returned as a column vector.

The upper confidence bound for pihat is pihat plus dhi. Similarly, the upper confidence bound for yhat is yhat plus dhi. Confidence bounds are nonsimultaneous and apply only to the fitted curve, not to new observations.
