[Answer]-"PANDAS & glob - Excel file format cannot be determined, you must specify an engine manually"


To solve the error "Excel file format cannot be determined, you must specify an engine manually" when using Pandas and glob to read Excel files, follow these steps:

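  1. Install Required Libraries: Ensure that Pandas is installed in your Python environment, along with the engine library it uses to parse each format (openpyxl for .xlsx files, xlrd for .xls files).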
  2. Check File Extensions: Ensure that all the Excel files you're trying to read have valid file extensions (.xls or .xlsx).
  3. Specify Engine Manually: When reading Excel files using Pandas, explicitly pass the engine parameter to read_excel: use 'openpyxl' for .xlsx files and 'xlrd' for legacy .xls files.
  4. Use glob to Read Multiple Files: If you're using glob to read multiple Excel files, ensure that you pass the correct file paths to the read_excel function and specify the engine parameter as shown above.
  5. Upgrade Pandas and Dependencies: If you're still encountering issues, consider upgrading Pandas and its dependencies to the latest versions.
  6. Check File Integrity: Make sure that the Excel files you're trying to read are not corrupted or damaged.
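
Steps 2 to 4 can be sketched roughly as follows. This is only a minimal illustration: the helper names pick_engine and read_excel_files and the ENGINES mapping are made up for this example, and it assumes openpyxl and xlrd are installed next to Pandas.

```python
import glob
from pathlib import Path

# Hypothetical mapping from file extension to a Pandas engine name.
ENGINES = {".xlsx": "openpyxl", ".xlsm": "openpyxl", ".xls": "xlrd"}


def pick_engine(path):
    """Return the engine name for an Excel path, or None for other extensions."""
    return ENGINES.get(Path(path).suffix.lower())


def read_excel_files(pattern):
    """Read every Excel file matching a glob pattern into a dict of DataFrames."""
    # Imported here so pick_engine stays usable without Pandas installed.
    import pandas as pd

    frames = {}
    for path in sorted(glob.glob(pattern)):
        engine = pick_engine(path)
        if engine is None:
            continue  # not an Excel extension we know how to read
        frames[path] = pd.read_excel(path, engine=engine)
    return frames
```

Calling read_excel_files("data/*.xls*") (the folder name is a placeholder) would then return one DataFrame per matching file, keyed by path. Note that picking the engine by extension does not help if a matched file is not a real workbook.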

When an Excel file is opened, for example, by MS Excel, a hidden temporary file is created in the same directory:

~$datasheet.xlsx

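This temporary file is only a lock file, not a real workbook, but a glob pattern such as *.xlsx still matches it. So, when I run the code to read all the files from the folder, Pandas tries to open it as well and gives me the error: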

Excel file format cannot be determined, you must specify an engine manually.

When all files are closed and no hidden temporary files like ~$filename.xlsx are present in the same directory, the code works perfectly.
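
If closing every workbook first is not practical, one possible workaround is to filter the temporary files out of the glob results before reading anything. The sketch below assumes the files are .xlsx and that the temporary files always start with "~$"; the function name excel_paths is hypothetical.

```python
import glob
import os


def excel_paths(folder):
    """List the .xlsx files in a folder, skipping Excel's "~$" temporary files."""
    pattern = os.path.join(folder, "*.xlsx")
    return sorted(
        path
        for path in glob.glob(pattern)
        # "~$datasheet.xlsx" matches "*.xlsx" but is not a real workbook.
        if not os.path.basename(path).startswith("~$")
    )
```

Each returned path can then be passed to pd.read_excel(path, engine="openpyxl") as usual, whether or not the workbooks are currently open in Excel.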