I have a dataset in txt format:
<TICKER>,<PER>,<DATE>,<TIME>,<OPEN>,<HIGH>,<LOW>,<CLOSE>,<VOL>,<OI>
EURUSD,5,20180307,080500,1.24210,1.24219,1.24201,1.24214,117,0
EURUSD,5,20180307,081000,1.24217,1.24249,1.24212,1.24236,165,0
EURUSD,5,20180307,081500,1.24235,1.24279,1.24232,1.24259,251,0
EURUSD,5,20180307,082000,1.24260,1.24273,1.24238,1.24248,196,0
EURUSD,5,20180307,082500,1.24247,1.24262,1.24241,1.24259,173,0
EURUSD,5,20180307,083000,1.24257,1.24310,1.24242,1.24302,281,0
EURUSD,5,20180307,083500,1.24298,1.24327,1.24291,1.24310,204,0
etc.
The TIME column should contain 6 digits in HHMMSS format, but when I read the file in R, the time is read incorrectly, and so is the file's header:
d <- read.table(file = "C:/Users/TARAS/Desktop/OHLC.txt",header = T,sep = ",")
head(d)
X.TICKER. X.PER. X.DATE. X.TIME. X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL. X.OI.
1 EURUSD 5 20180307 80500 1.24210 1.24219 1.24201 1.24214 117 0
2 EURUSD 5 20180307 81000 1.24217 1.24249 1.24212 1.24236 165 0
3 EURUSD 5 20180307 81500 1.24235 1.24279 1.24232 1.24259 251 0
4 EURUSD 5 20180307 82000 1.24260 1.24273 1.24238 1.24248 196 0
5 EURUSD 5 20180307 82500 1.24247 1.24262 1.24241 1.24259 173 0
6 EURUSD 5 20180307 83000 1.24257 1.24310 1.24242 1.24302 281
As you can see, the format has changed to "HMMSS".
And sometimes it turns out like this:
188 EURUSD 5 20180307 234000 1.24125 1.24137 1.24125 1.24134 45 0
189 EURUSD 5 20180307 234500 1.24130 1.24130 1.24111 1.24116 81 0
190 EURUSD 5 20180307 235000 1.24102 1.24115 1.24095 1.24096 89 0
191 EURUSD 5 20180307 235500 1.24097 1.24105 1.24092 1.24092 42 0
192 EURUSD 5 20180308 0 1.24091 1.24115 1.24091 1.24104 55 0
193 EURUSD 5 20180308 500 1.24103 1.24109 1.24102 1.24107 45 0
194 EURUSD 5 20180308 1000 1.24106 1.24107 1.24103 1.24105 37 0
195 EURUSD 5 20180308 1500 1.24106 1.24109 1.24100 1.24100 20 0
196 EURUSD 5 20180308 2000 1.24099 1.24102 1.24097 1.24098 21 0
197 EURUSD 5 20180308 2500 1.24099 1.24101 1.24096 1.24097 36 0
198 EURUSD 5 20180308 3000 1.24096 1.24110 1.24087 1.24109 81 0
199 EURUSD 5 20180308 3500 1.24108 1.24110 1.24106 1.24107 31
How can I fix this? To be clear, everything in the .txt file itself is correct.
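You can pass the colClasses argument to read.table, i.e. specify the data type of each column. For the first 4 columns the types are character, numeric, numeric, character. The data in the TIME column will then be treated as strings, and the leading zero will not be dropped.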
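A minimal sketch of such a call, using two rows copied from the question through the `text` argument so that it runs without the file (in practice you would pass `file = "C:/Users/TARAS/Desktop/OHLC.txt"` instead). Two details here are my assumptions rather than part of the answer above. First, an unnamed colClasses vector shorter than the number of columns is recycled, so the sketch pads it with NA to let R guess the remaining six columns. Second, `check.names = FALSE` is added to keep the header as `<TICKER>`, `<PER>`, and so on, instead of `X.TICKER.`, `X.PER.`.

```r
# Two sample rows taken from the question; replace text = raw with file = "..." for the real file
raw <- "<TICKER>,<PER>,<DATE>,<TIME>,<OPEN>,<HIGH>,<LOW>,<CLOSE>,<VOL>,<OI>
EURUSD,5,20180307,080500,1.24210,1.24219,1.24201,1.24214,117,0
EURUSD,5,20180307,081000,1.24217,1.24249,1.24212,1.24236,165,0"

d <- read.table(
  text = raw, header = TRUE, sep = ",",
  # First 4 columns typed explicitly; NA means "let read.table guess" for the other 6
  colClasses = c("character", "numeric", "numeric", "character", rep(NA, 6)),
  # Assumption: keep the original header names rather than X.TICKER., X.PER., ...
  check.names = FALSE
)

str(d)
```

With `check.names = FALSE` the columns are no longer syntactic R names, so they are accessed as `d[["<TIME>"]]` or with backticks rather than `d$X.TIME.`.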
From then on, you should work with this data as strings.
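As one possible illustration of working with the values as strings, the sketch below splits HHMMSS with `substr()` and combines date and time into a date-time with `as.POSIXct()`. The vectors `date_chr` and `time_chr` are hypothetical stand-ins for the DATE and TIME columns read as character, and the UTC time zone is an assumption, since the question does not say which zone the quotes use.

```r
# Hypothetical stand-ins for the DATE and TIME columns, both held as character
date_chr <- c("20180307", "20180308")
time_chr <- c("235500", "000500")

# The strings are fixed-width, so substr() can pick out hours and minutes
hours   <- substr(time_chr, 1, 2)
minutes <- substr(time_chr, 3, 4)

# Paste date and time together and parse the result (tz = "UTC" is an assumption)
stamp <- as.POSIXct(paste(date_chr, time_chr),
                    format = "%Y%m%d %H%M%S", tz = "UTC")

print(stamp)
```

Because the leading zeros are preserved, a value such as "000500" parses as five minutes past midnight instead of being misread.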