I would like to use the manufacturer's list for each product in Pandas in advance and identify and automatically input the manufacturer from the product details in the purchase list.

Asked 3 months ago, Updated 3 months ago, 7 views

I would like to do the following with Pandas.

DataFrame 1

Enter a description of the image here

  • DataFrame1 is a purchase list
  • A, B, C, and D are the names of the products.
  • % should be irrelevant characters and numbers

DataFrame 2

list item

<What do you want to do>
①Add the column "Marker" to DataFrame1.
②"From the ""Product_Description"" line of DataFrame 1, identify the product name listed in ""Product_Name"" of Dataaframe 2 from the list of unrelated characters and numbers
" ③Automatically enter the manufacturer's name for each item in the "Marker" line of DataFrame 1.

Enter a description of the image here

To do this, first 1) create a Marker dictionary using Product_Name as the key from DataFrame 2, 2) read "Product_Description" in DataFrame 1 line at a time, 3) search for any of the Maker dictionary keys in the read string, 4) put the corresponding Maker value in the Maker of DataFrame 1

So, I think I can do it, but I've just started studying Python, so I can't write the code well...
It would be very helpful if you could give me an answer for future reference>
Thank you for your cooperation.

python pandas

2022-09-30 13:47

1 Answers

The part that I corrected after receiving the comments in the answer to this article seems to realize the steps I want to take.
Pandas:using str.contains and map to find some substring and replace value in column

EDIT by comment:
It sees there is no match by dict, you can test it by sample:

df1=pd.DataFrame({'device_id':['ad', 'bs', 'cr'], 'b':[1,2,3]})    
df2 = pd.DataFrame({'url':['a', 'm', 'k'], 'category':['one', 'two', 'three']})    
# df2 = pd.DataFrame({'url': ['ar', 'm', 'k'], 'category': ['one', 'two', 'three']})    

d=df2.set_index('url')['category'].to_dict()
print(d)
{'k': 'three', 'a': 'one', 'm': 'two'}

df1['category'] = df1.device_id.apply(lambdax:pd.Series([vfork, vind.items()ifkinx])) 
print(df1)
   b device_id category
0 1 add one
12 bs NaN
23cr NaN

If you apply it to your question, you will see the following:
By the way, Marker is the typeo of Maker.
Also, in the third image, the order of onions and coffee is not supported.

import pandas as pd

df1 = pd.DataFrame({
  'Product_Description': 
    ['%%%carrot%%%',
      '%%%%%%Potato%%',
      '%%%% Onion%%%%',
      "%%% Coffee%%%%%"]
})

df2 = pd.DataFrame({
  "Product_Name": 
    ["Carrot",
      'Potatoes',
      "Onion",
      'Coffee',
  "Maker": 
    ["Company A",
      'Company B',
      'Company B',
      [Company C]
})

df2dict=df2.set_index('Product_Name')['Maker'].to_dict()
print(df2dict)

df1['Maker'] = df1.Product_Description.apply(lambdax:pd.Series([vfork, vindf2dict.items()ifkinx]))

print(df1)

Display Results
df2dict

{'Carrot': 'A Company', 'Potato': 'B Company', 'Onion Company': 'B Company', 'C Coffee Company': 'C Company'}

df1

Product_Description Maker
0%%%Carrot%%%%A Company
1%%%%%%Potato%%B Company
2%%%% Onion%%%%%B Company
3%% Coffee %%%%C Company


2022-09-30 13:47

If you have any answers or tips


© 2023 OneMinuteCode. All rights reserved.