
After using pd.DataFrame, my data values changed incorrectly

I have three arrays: skinColorY, skinColorCr, and skinColorCb. I fill them from a reshaped image, and the values are all fine at that point. I then want to fit a KMeans model on the data, so I convert it with pd.DataFrame. This is what my code looks like:

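import pandas as pd
from sklearn.cluster import KMeans

skinColorY = []
skinColorCr = []
skinColorCb = []
counter = 0  # assumption: initialized here; the original snippet does not show it

# Keep only the pixels that belong to the chosen cluster
for index, pixel in enumerate(cluster_labels_crcb):
    if pixel == choosenCluster[0]:
        skinColorY.append(reshaped_y[index][0])
        skinColorCr.append(reshapedCrCb[index][0])
        skinColorCb.append(reshapedCrCb[index][1])
        counter += 1

data = {'y': skinColorY,
        'cr': skinColorCr,
        'cb': skinColorCb}

df = pd.DataFrame(data, columns=['y', 'cr', 'cb'])

# One cluster, so the single centroid summarizes all selected pixels
kmeans = KMeans(n_clusters=1).fit(df)
centroids = kmeans.cluster_centers_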
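
As a side note on the snippet above: with n_clusters=1 the k-means centroid is the mean of all points, so the per-column mean gives a quick cross-check that does not depend on KMeans. The following is only a minimal sketch; the three short lists are hypothetical stand-ins for the real pixel values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in values; the real lists come from the image pixels.
skinColorY = [120, 130, 125]
skinColorCr = [150, 152, 151]
skinColorCb = [110, 108, 112]

df = pd.DataFrame({'y': skinColorY, 'cr': skinColorCr, 'cb': skinColorCb},
                  columns=['y', 'cr', 'cb'])

# With a single cluster, the centroid is the mean of all points,
# so this is the value cluster_centers_[0] should be close to.
expected_centroid = df.mean().to_numpy()

# The same means computed straight from the lists, bypassing pandas.
from_lists = np.array([np.mean(skinColorY),
                       np.mean(skinColorCr),
                       np.mean(skinColorCb)])

print(expected_centroid)
print(np.allclose(expected_centroid, from_lists))
```

If the two results disagree on one machine but not the other, that would point at the DataFrame construction rather than at KMeans.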

So I checked my data before converting it with pd.DataFrame. I exported it to a workbook like this, and the data is correct (I compared it with the data I get when I run the code in my local Python):

import xlsxwriter

# Export the three raw lists, one list per column
workbook = xlsxwriter.Workbook("./mysite/xls/"+filename+".xlsx")
worksheet = workbook.add_worksheet()
row = 0
# The loop variable is named values so it does not overwrite the data dict
for col, values in enumerate([skinColorY, skinColorCr, skinColorCb]):
    worksheet.write_column(row, col, values)
workbook.close()

# Export the same columns as stored in the data dict
workbook = xlsxwriter.Workbook("./mysite/xls/data"+filename+".xlsx")
worksheet = workbook.add_worksheet()
row = 0
for col, values in enumerate([data['y'], data['cr'], data['cb']]):
    worksheet.write_column(row, col, values)
workbook.close()

Here is the result I get. On the left is the data from my local machine and on the right is the data from the online run; there is no difference between them.

Screenshot: comparison of the arrays (skinColorY, skinColorCr, skinColorCb)

Screenshot: comparison of the variable "data" (data['y'], data['cr'], data['cb'])

But when I convert it with pd.DataFrame, the data changes incorrectly. I checked this by exporting the DataFrame to Excel:

df = pd.DataFrame(data, columns=['y', 'cr', 'cb'])
df.to_excel(excel_writer = "./mysite/xls/test"+filename+".xlsx")
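
Before exporting, it may help to compare the DataFrame against the source lists in memory and to print the library versions and dtypes on both machines. This is only a diagnostic sketch: describe_mismatch is a hypothetical helper, and the uint8 values are stand-ins assumed to resemble image pixel data.

```python
import numpy as np
import pandas as pd

def describe_mismatch(data, df):
    """Count, per column, how many DataFrame values differ from the source list."""
    report = {}
    for name, values in data.items():
        source = np.asarray(values, dtype=np.int64)
        stored = df[name].to_numpy().astype(np.int64)
        report[name] = int((source != stored).sum())
    return report

# Hypothetical stand-in data: NumPy uint8 scalars, as pixel values often are.
data = {
    'y': [np.uint8(200), np.uint8(35)],
    'cr': [np.uint8(150), np.uint8(140)],
    'cb': [np.uint8(110), np.uint8(120)],
}
df = pd.DataFrame(data, columns=['y', 'cr', 'cb'])

# Compare these lines between the local run and the online run.
print(pd.__version__, np.__version__)
print({k: type(v[0]).__name__ for k, v in data.items()})  # element types in the lists
print(df.dtypes)                                          # dtypes pandas chose
print(describe_mismatch(data, df))
```

A non-zero count for a column would show that the values already differ inside the DataFrame, before to_excel is involved.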

Here is the result I get. On the left is the data from my local machine and on the right is the data from the online run. The Y and Cr columns look very different, but the Cb column is correct. I don't know why the values change, because the same code runs fine on my local machine.

Screenshot: comparison of the data after converting with pd.DataFrame

Is there something I need to do so that the data does not change? Thank you for the help.

That's beyond my knowledge of using Pandas; I see that you posted your question on Stack Overflow too, so perhaps someone with more Pandas knowledge will be able to help there.
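
The cause is not established anywhere in this thread, so the following is only something that might be worth trying, not a confirmed fix. It assumes the lists hold NumPy scalars taken from image arrays, and it builds the DataFrame from one array with an explicit int64 dtype, so both environments start from the same plain integer values. build_frame is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

def build_frame(y, cr, cb):
    """Build the DataFrame from one explicit int64 array instead of a dict of lists."""
    stacked = np.column_stack([
        np.asarray(y, dtype=np.int64),
        np.asarray(cr, dtype=np.int64),
        np.asarray(cb, dtype=np.int64),
    ])
    return pd.DataFrame(stacked, columns=['y', 'cr', 'cb'])

# Hypothetical stand-in values mixing NumPy scalars and plain ints.
df = build_frame([np.uint8(200), np.uint8(35)], [150, 140], [110, 120])
print(df)
print(df.dtypes)
```

If the exported values still differ between the two machines after this, the dtype of the list elements is probably not the explanation.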