
Dummy variables

# Replace decimal commas with decimal points so that R parses the numbers as numeric
with open('apartment.tsv', 'r') as f:
    s = f.read().replace(',', '.')
with open('ap.tsv', 'w') as fnew:
    fnew.write(s)
%r
ap <- read.table('ap.tsv', header=TRUE, sep='\t')
str(ap)
'data.frame':	100 obs. of  10 variables:
 $ X : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Y : num  15.9 27 21.1 24.5 13.5 22.5 15.5 75.9 15.1 26 ...
 $ X1: int  1 3 2 4 1 2 3 4 1 2 ...
 $ X2: Factor w/ 4 levels "К","М","П","С": 4 1 4 4 1 1 4 3 1 1 ...
 $ X3: num  39 68.4 54.7 90 34.8 48 68.1 132 39 55.5 ...
 $ X4: num  20 40.5 28 64 16 29 44.4 89.6 20 35 ...
 $ X5: num  8.2 10.7 10.7 15 10.7 8 7.2 11 8.5 8 ...
 $ X6: int  0 0 0 0 0 1 0 1 0 0 ...
 $ X7: int  1 1 1 0 0 1 0 1 1 1 ...
 $ X8: Factor w/ 2 levels "В","Н": 2 2 2 1 2 1 1 2 2 1 ...
%r
X8.f <- factor(ap$X8)
dummy8 <- model.matrix(~X8.f)
dummy8
    (Intercept) X8.fН
1             1     1
2             1     1
3             1     1
4             1     0
5             1     1
6             1     0
7             1     0
8             1     1
9             1     1
10            1     0
11            1     0
12            1     0
13            1     1
14            1     0
15            1     1
16            1     1
17            1     0
18            1     1
19            1     1
20            1     1
21            1     0
22            1     0
23            1     1
24            1     1
25            1     1
26            1     0
27            1     0
28            1     0
29            1     1
30            1     1
...
71            1     1
72            1     0
73            1     0
74            1     0
75            1     1
76            1     1
77            1     1
78            1     0
79            1     0
80            1     1
81            1     0
82            1     0
83            1     0
84            1     1
85            1     1
86            1     1
87            1     0
88            1     0
89            1     0
90            1     1
91            1     1
92            1     1
93            1     0
94            1     0
95            1     0
96            1     1
97            1     1
98            1     1
99            1     1
100           1     0
%r
X2.f <- factor(ap$X2)
dummy2 <- model.matrix(~X2.f)
dummy2
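    (Intercept) X2.fМ X2.fП X2.fС
1             1     0     0     1
2             1     0     0     0
3             1     0     0     1
4             1     0     0     1
5             1     0     0     0
6             1     0     0     0
7             1     0     0     1
8             1     0     1     0
9             1     0     0     0
10            1     0     0     0
11            1     0     1     0
12            1     0     0     0
13            1     0     0     0
14            1     1     0     0
15            1     1     0     0
16            1     0     0     1
17            1     1     0     0
18            1     0     0     0
19            1     0     1     0
20            1     1     0     0
21            1     0     0     1
22            1     0     0     0
23            1     0     1     0
24            1     0     1     0
25            1     0     1     0
26            1     1     0     0
27            1     0     1     0
28            1     0     0     0
29            1     0     0     1
30            1     0     1     0
...
71            1     0     0     1
72            1     1     0     0
73            1     1     0     0
74            1     0     1     0
75            1     1     0     0
76            1     0     1     0
77            1     0     1     0
78            1     0     0     1
79            1     0     0     1
80            1     0     0     0
81            1     0     1     0
82            1     0     0     1
83            1     0     0     1
84            1     1     0     0
85            1     1     0     0
86            1     0     1     0
87            1     1     0     0
88            1     0     0     1
89            1     0     0     1
90            1     0     1     0
91            1     1     0     0
92            1     0     1     0
93            1     0     0     0
94            1     0     0     1
95            1     0     0     1
96            1     1     0     0
97            1     0     1     0
98            1     1     0     0
99            1     1     0     0
100           1     1     0     0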
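Both matrices follow the same pattern: the first factor level ("В" for X8, "К" for X2) serves as the baseline and gets no column of its own, while every other level gets a 0/1 indicator column next to the intercept. A minimal Python sketch of that coding, for illustration only: `treatment_dummies` is a hypothetical helper rather than part of R or any library, and the level order is passed in explicitly instead of being inferred from the data.

```python
def treatment_dummies(values, levels, prefix=''):
    """Treatment (baseline) coding with an intercept column.

    levels[0] is the baseline and gets no column; every other level
    gets one 0/1 indicator column.
    """
    others = levels[1:]
    names = ['(Intercept)'] + [prefix + lev for lev in others]
    rows = [[1] + [1 if v == lev else 0 for lev in others] for v in values]
    return names, rows


# First ten values of X2, decoded from the str(ap) output
# (codes 4 1 4 4 1 1 4 3 1 1 with levels "К","М","П","С").
x2_head = ['С', 'К', 'С', 'С', 'К', 'К', 'С', 'П', 'К', 'К']
names, rows = treatment_dummies(x2_head, ['К', 'М', 'П', 'С'], prefix='X2.f')

print(names)
for i, row in enumerate(rows, start=1):
    print(i, row)
```

The printed rows should agree with the first ten rows of dummy2. A baseline observation ("К") is a row with zeros in all three indicator columns.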
%r
ap.lm <- lm(ap$Y ~ ap$X1 + ap$X3 + ap$X4 + ap$X5 + dummy8[,2])
summary(ap.lm)
Call:
lm(formula = ap$Y ~ ap$X1 + ap$X3 + ap$X4 + ap$X5 + dummy8[, 2])

Residuals:
     Min       1Q   Median       3Q      Max 
-12.1940  -4.0914   0.2285   3.0462  15.8874 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -2.8049     1.8582  -1.509   0.1345    
ap$X1        -1.9545     1.0562  -1.850   0.0674 .  
ap$X3         0.6377     0.1201   5.308 7.38e-07 ***
ap$X4        -0.1911     0.1493  -1.280   0.2038    
ap$X5        -0.3848     0.1973  -1.951   0.0541 .  
dummy8[, 2]   8.2635     1.2797   6.457 4.65e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.633 on 94 degrees of freedom
Multiple R-squared:  0.8406,	Adjusted R-squared:  0.8321 
F-statistic: 99.13 on 5 and 94 DF,  p-value: < 2.2e-16
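Two figures in this summary can be recomputed by hand from the other reported numbers: the t value of the dummy coefficient (estimate divided by its standard error) and the adjusted R-squared (from R-squared, the 100 observations, and the 5 predictors). A small Python check, using only the rounded values printed above, so agreement is expected only to about three decimal places:

```python
# Values copied from the summary(ap.lm) output above.
estimate = 8.2635      # coefficient of dummy8[, 2]
std_error = 1.2797     # its standard error
r_squared = 0.8406     # Multiple R-squared
n = 100                # observations
p = 5                  # predictors, excluding the intercept

# t value = estimate / standard error
t_value = estimate / std_error

# Adjusted R-squared = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

print(round(t_value, 3))
print(round(adj_r_squared, 4))
```

The denominator n - p - 1 = 94 is the residual degrees of freedom reported in the summary.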